NOVEL HUMAN GENES AND GENE EXPRESSION PRODUCTS

FIELD OF THE INVENTION 

The present invention relates to novel polynucleotides of human origin 
and the encoded gene products. 

BACKGROUND OF THE INVENTION

Identification of novel polynucleotides, particularly those that encode an 
expressed gene product, is important in the advancement of drug discovery, diagnostic 
technologies, and the understanding of the progression and nature of complex diseases 
such as cancer. Identification of genes expressed in different cell types isolated from 
sources that differ in disease state or stage, developmental stage, exposure to various
environmental factors, the tissue of origin, the species from which the tissue was 
isolated, and the like is key to identifying the genetic factors that are responsible for the 
phenotypes associated with these various differences. 

This invention provides novel human polynucleotides, the polypeptides 
encoded by these polynucleotides, and the genes and proteins corresponding to these
novel polynucleotides. 

SUMMARY OF THE INVENTION 

This invention relates to novel human polynucleotides and variants 
thereof, their encoded polypeptides and variants thereof, to genes corresponding to these
polynucleotides and to proteins expressed by the genes. The invention also relates to
diagnostics and therapeutics comprising such novel human polynucleotides, their 
corresponding genes or gene products, including probes, antisense nucleotides, and 
antibodies. The polynucleotides of the invention correspond to a polynucleotide 
comprising the sequence information of at least one of SEQ ID NOs: 1-3351.

Various aspects and embodiments of the invention will be readily
apparent to the ordinarily skilled artisan upon reading the description provided herein.

DETAILED DESCRIPTION OF THE INVENTION 

The invention relates to polynucleotides comprising the disclosed 
nucleotide sequences, to full length cDNA, mRNA, genomic sequences, and genes
corresponding to these sequences and degenerate variants thereof, and to polypeptides
encoded by the polynucleotides of the invention and polypeptide variants. 

Polypeptide variants differ from wild type protein in having one or more 
amino acid substitutions that either enhance, add, or diminish a biological activity of the 
wild type protein.

Six of the polynucleotides disclosed herein encode new members of the MKK
kinase family; the coding region is found within the nucleotide region in parentheses: SEQ
ID NO:29 (nucleotides 295-421); SEQ ID NO:31 (298-397); SEQ ID NO:196 (37-322);
SEQ ID NO:3175 (nucleotides 14-164); SEQ ID NO:3190 (229-390); and SEQ ID
NO:3281 (15-182). Twenty-four of the polynucleotides encode new members of the family
of transcription factor proteins having a basic region plus leucine zipper: SEQ ID NO:410
(42-191); SEQ ID NO:552 (116-288); SEQ ID NO:768 (116-288); SEQ ID NO:822 (108-262);
SEQ ID NO:836 (158-353); SEQ ID NO:1288 (73-234); SEQ ID NO:1365 (69-257);
SEQ ID NO:1540 (289-471); SEQ ID NO:1549 (200-391); SEQ ID NO:1556 (163-354);
SEQ ID NO:1557 (207-398); SEQ ID NO:1563 (107-298); SEQ ID NO:1622 (180-365);
SEQ ID NO:1630 (100-291); SEQ ID NO:1704 (184-372); SEQ ID NO:1808 (36-161);
SEQ ID NO:1454 (49-209); SEQ ID NO:2363 (48-211); SEQ ID NO:2424 (43-194);
SEQ ID NO:3147 (190-369); SEQ ID NO:3152 (129-320); SEQ ID NO:3158 (167-334);
and SEQ ID NO:3208 (34-256).

SEQ ID NOs:186 (175-395); 2591 (60-165); 3307 (43-321); and 3339
(94-342) encode polypeptides having an SH2 domain, and SEQ ID NOs:234 (23-121),
1832 (18-173), and 1835 (57-206) encode polypeptides having an SH3 domain. Nine
polynucleotides encode new members of the family of proteins having Ank repeat regions:
SEQ ID NO:187 (358-432); SEQ ID NO:1268 (238-315); SEQ ID NO:1804 (301-378);
SEQ ID NO:1819 (278-355); SEQ ID NO:1839 (224-307); SEQ ID NO:1830 (184-267);
SEQ ID NO:2562 (18-101); SEQ ID NO:3015 (131-214); and SEQ ID NO:3267 (97-180).

The following eleven polynucleotides encode polypeptides having a C2H2
type zinc finger: SEQ ID NOs:308 (110-172); 807 (339-392); 1324 (294-356); 1503 (154-216);
1527 (156-212); 1674 (196-258); 1779 (64-126); 1801 (295-351); 3081 (190-252);
3193 (293-355); and 3306 (161-223). Eight polynucleotides encode polypeptides of the
family of ATPases: SEQ ID NOs:431 (71-428); 639 (157-561); 2135 (2-401); 2684 (9-461);
2859 (100-320); 3178 (45-386); 3197 (281-343); and 3266 (8-139). Polypeptides
having a fibronectin type III domain are encoded by SEQ ID NOs:746 (209-427) and 1192
(186-416). Polypeptides having an EF-hand domain are encoded by SEQ ID NOs:820 (341-406);
1755 (281-367); and 3285 (16-102). Six polypeptides of the protein kinase family are
encoded by SEQ ID NOs:1157 (41-444); 1478 (54-437); 1496 (241-520); 2286 (12-182);
2969 (5-387); and 3190 (118-390).

LIM domain-containing polypeptides are encoded by SEQ ID NOs:1269
(79-240); 1309 (248-404); 1360 (222-377); and 1386 (243-398). Two polypeptides of the
family having a C2 domain (protein kinase C-like) are encoded by SEQ ID NOs:1325 (1-234)
and 2282 (183-353). Polypeptides having a WD domain, G-beta repeat motif are
encoded by SEQ ID NOs:1336 (66-164); 1380 (42-140); 1711 (263-361); 1762 (236-334);
1909 (160-258); 2218 (127-225); 3047 (191-292); 3108 (275-367); and 3292 (208-300).

SEQ ID NO:1410 (222-350) encodes a member of the trypsin family. SEQ
ID NOs:1417 (8-354); 2281 (20-387); and 2310 (20-371) encode members of the protein
tyrosine phosphatase family. SEQ ID NOs:1464 (4-180) and 1514 (2-252) encode
members of the family having an RNA recognition motif (also known as RRM, RBD, or
RNP domain). SEQ ID NOs:1496 (241-520) and 3297 (7-153) encode helicases having a
conserved C-terminal domain. SEQ ID NO:1538 (9-635) encodes a member of the wnt
family of developmental signaling proteins.

Three polynucleotides encode polypeptides having a homeobox domain:
SEQ ID NOs:1676 (9-86); 1820 (123-299); and 1821 (127-303). A novel thioredoxin is
encoded by SEQ ID NO:1677 (316-369). Two novel members of the ras family are
encoded by SEQ ID NOs:1688 (109-410) and 3258 (138-394). A novel polypeptide having a
phosphatidylinositol-specific phospholipase C Y-domain is encoded by SEQ ID NO:1707
(92-439). A novel serine carboxypeptidase is encoded by SEQ ID NO:1744 (238-433). A
novel polypeptide having N-terminal homology in the Ets domain is encoded by SEQ ID
NO:1811 (184-315). A novel polypeptide having a bromodomain is encoded by SEQ ID
NO:1814 (127-294). A novel polypeptide having a double-stranded RNA binding motif is
encoded by SEQ ID NO:1818 (9-146). A novel polypeptide having a G-protein alpha
subunit is encoded by SEQ ID NO:1846 (12-398).

SEQ ID NOs:1911 (35-151) and 1980 (60-197) encode polypeptides 
having a C3HC4 type zinc finger domain (RING finger). SEQ ID NO:2065 (253-306)
encodes a polypeptide having a CCHC zinc finger domain. SEQ ID NO:2216 (90-179)
encodes a polypeptide having a WW/rsp5/WWP domain. SEQ ID NO:2428 (25-350) 
encodes a polypeptide member of the dual specificity phosphatase family, having a 
catalytic domain. 

SEQ ID NOs:2577 (0-311); 3183 (14-215); and 3195 (0-215) encode
members of the 4 transmembrane segment integral membrane protein family. SEQ ID
NOs:2826 (116-400) and 2871 (198-392) encode polypeptides of the DEAD and DEAH
box helicase family. SEQ ID NO:2944 (18-281) encodes a polypeptide having a
calpain large subunit, domain III.

SEQ ID NO:3274 (11-187) encodes a eukaryotic transcription factor 
with a fork head domain. SEQ ID NO:3345 (65-271) encodes a polypeptide having a
PDZ domain, and SEQ ID NO:3351 (124-270) encodes a polypeptide in the family of 
phorbol esters/glycerol binding proteins. 
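
As an illustration of how the parenthetical nucleotide ranges above could be used, the following minimal Python sketch extracts a region from a sequence and translates it with the standard genetic code. It assumes the ranges are 1-based and inclusive, which the text does not state (some listed ranges begin at 0, so the actual convention may differ). The function names `coding_region` and `translate` and the demo sequence are hypothetical and are not taken from the sequence listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def coding_region(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq (ASSUMED 1-based, inclusive)."""
    if start < 1 or end > len(seq) or start > end:
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def translate(region: str) -> str:
    """Translate complete codons in frame 1; unknown codons become 'X'."""
    region = region.upper()
    usable = len(region) - len(region) % 3
    return "".join(
        CODON_TABLE.get(region[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Hypothetical demo sequence, not one of the disclosed SEQ ID NOs.
demo = "GGATGGCCTAAGG"
region = coding_region(demo, 3, 11)
protein = translate(region)
```

Whether a given range is translated in frame 1, as assumed here, would need to be confirmed against the sequence listing.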

Described below are polynucleotide compositions encompassed by the 
invention, methods for obtaining cDNA or genomic DNA encoding a full-length gene
product, expression of these polynucleotides and genes, identification of structural motifs
of the polynucleotides and genes, identification of the function of a gene product encoded 
by a gene corresponding to a polynucleotide of the invention, use of the provided 
polynucleotides as probes and in mapping and in tissue profiling, use of the corresponding 
polypeptides and other gene products to raise antibodies, and use of the polynucleotides
and their encoded gene products for therapeutic and diagnostic purposes.

Polynucleotide Compositions 

The scope of the invention with respect to polynucleotide compositions 
includes, but is not necessarily limited to, polynucleotides having a sequence set forth in 
any one of SEQ ID NOs: 1-3351; polynucleotides obtained from the biological materials
described herein or other biological sources (particularly human sources) by
hybridization under stringent conditions (particularly conditions of high stringency); 
genes corresponding to the provided polynucleotides; variants of the provided 
polynucleotides and their corresponding genes, particularly those variants that retain a 
biological activity of the encoded gene product (e.g., a biological activity ascribed to a
gene product corresponding to the provided polynucleotides as a result of the
assignment of the gene product to a protein family(ies) and/or identification of a 
functional domain present in the gene product). Other nucleic acid compositions 
contemplated by and within the scope of the present invention will be readily apparent 
to one of ordinary skill in the art when provided with the disclosure here. 

"Polynucleotide" and "nucleic acid" as used herein with reference to nucleic acids of
the composition are not intended to be limiting as to the length or structure of the nucleic
acid unless specifically indicated.

The invention features polynucleotides that are expressed in human 
tissue, specifically human colon, breast, and/or lung tissue. Novel nucleic acid
compositions of the invention comprise a sequence set forth in any one of SEQ ID
NOs: 1-3351 or an identifying sequence thereof. An "identifying sequence" is a
contiguous sequence of residues at least about 10 nt to about 20 nt in length, usually at
least about 50 nt to about 100 nt in length, that uniquely identifies a polynucleotide
sequence, e.g., exhibits less than 90%, usually less than about 80% to about 85%
sequence identity to any contiguous nucleotide sequence of more than about 20 nt.
Thus, the subject novel nucleic acid compositions include full length cDNAs or mRNAs
that encompass an identifying sequence of contiguous nucleotides from any one of SEQ
ID NOs: 1-3351.
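
The "identifying sequence" criterion can be sketched computationally. The Python example below is an illustration only: it uses a simplified, ungapped sliding-window comparison (an assumption; the text does not prescribe a particular procedure), and the names `max_window_identity` and `is_identifying` are hypothetical. It reports whether a candidate stays below a chosen percent-identity threshold against every same-length window of a set of other sequences.

```python
def max_window_identity(candidate: str, other: str) -> float:
    """Highest ungapped percent identity between candidate and any
    same-length window of other (0.0 if other is shorter)."""
    n = len(candidate)
    best = 0.0
    for i in range(len(other) - n + 1):
        window = other[i:i + n]
        matches = sum(a == b for a, b in zip(candidate, window))
        best = max(best, 100.0 * matches / n)
    return best


def is_identifying(candidate: str, others: list[str],
                   threshold: float = 90.0) -> bool:
    """True if candidate shows less than `threshold` percent identity
    to every window of every sequence in others."""
    return all(max_window_identity(candidate, o) < threshold for o in others)


# Hypothetical sequences for illustration only.
candidate = "ACGTACGTAC"
contains_it = "TTTACGTACGTACTTT"   # contains the candidate exactly
unrelated = "GGGGGGGGGGGGGG"       # matches only at the two G positions
```

A production screen would more plausibly use a database search tool such as BLAST, which also handles gapped matches.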

The polynucleotides of the invention also include polynucleotides having
sequence similarity or sequence identity. Nucleic acids having sequence similarity are
detected by hybridization under low stringency conditions, for example, at 50°C and 
10XSSC (0.9 M saline/0.09 M sodium citrate) and remain bound when subjected to 
washing at 55°C in 1XSSC. Sequence identity can be determined by hybridization
under stringent conditions, for example, at 50°C or higher and 0.1XSSC (9 mM
saline/0.9 mM sodium citrate). Hybridization methods and conditions are well known 
in the art, see, e.g., U.S. Patent No. 5,707,829. Nucleic acids that are substantially 
identical to the provided polynucleotide sequences, e.g., allelic variants, genetically 
altered versions of the gene, etc., bind to the provided polynucleotide sequences (SEQ
ID NOs: 1-3351) under stringent hybridization conditions. By using probes, particularly
labeled probes of DNA sequences, one can isolate homologous or related genes. The 
source of homologous genes can be any species, e.g., primate species, particularly 
human; rodents, such as rats and mice; canines, felines, bovines, ovines, equines, yeast, 
nematodes, etc. 

Preferably, hybridization is performed using at least 15 contiguous
nucleotides (nt) of at least one of SEQ ID NOs: 1-3351. That is, when at least 15
contiguous nt of one of the disclosed SEQ ID NOs. is used as a probe, the probe will 
preferentially hybridize with a nucleic acid comprising the complementary sequence, 
allowing the identification and retrieval of the nucleic acids that uniquely hybridize to
the selected probe. Probes from more than one SEQ ID NO. can hybridize with the
same nucleic acid if the cDNA from which they were derived corresponds to one 
mRNA. Probes of more than 15 nt can be used, e.g., probes of from about 18 nt to 
about 100 nt, but 15 nt represents sufficient sequence for unique identification. 
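
The probe selection described above can be illustrated with a short Python sketch that enumerates every 15-nt window of a sequence as a candidate probe and checks, under an exact-match model, whether a probe would pair with a given single-stranded target. The exact-match model is a simplifying assumption (real hybridization tolerates mismatches depending on stringency), and the function names and demo sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_probes(seq: str, length: int = 15) -> list[str]:
    """All contiguous windows of the given length, in 5' to 3' order."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def hybridizes(probe: str, target_strand: str) -> bool:
    """Exact-match model: the probe pairs with the target strand if the
    target contains the probe's reverse complement."""
    return reverse_complement(probe) in target_strand


# Hypothetical 39-nt sequence, not one of the disclosed SEQ ID NOs.
demo = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
probes = candidate_probes(demo)
complementary_strand = reverse_complement(demo)
```

Every probe drawn from `demo` pairs with `complementary_strand` in this model, since each window of a sequence has its reverse complement in the opposite strand.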

The polynucleotides of the invention also include naturally occurring
variants of the nucleotide sequences (e.g., degenerate variants, allelic variants).
Variants of the polynucleotides of the invention are identified by hybridization of
putative variants with nucleotide sequences disclosed herein, preferably by 
hybridization under stringent conditions. For example, by using appropriate wash 
conditions, variants of the polynucleotides of the invention can be identified where the 
allelic variant exhibits at most about 25-30% base pair (bp) mismatches relative to the
selected polynucleotide probe. In general, allelic variants contain 15-25% bp
mismatches, and can contain as few as 5-15%, or 2-5%, or 1-2% bp mismatches,
as well as a single bp mismatch. 

The invention also encompasses homologs corresponding to the
polynucleotides of SEQ ID NOs: 1-3351, where the source of homologous genes can be
any mammalian species, e.g., primate species, particularly human; rodents, such as rats;
canines, felines, bovines, ovines, equines, yeast, nematodes, etc. Between mammalian
species, e.g., human and mouse, homologs generally have substantial sequence 
similarity, e.g., at least 75% sequence identity, usually at least 90%, more usually at
least 95% between nucleotide sequences. Sequence similarity is calculated based on a
reference sequence, which may be a subset of a larger sequence, such as a conserved 
motif, coding region, flanking region, etc. A reference sequence will usually be at least 
about 18 contiguous nt long, more usually at least about 30 nt long, and may extend to 
the complete sequence that is being compared. Algorithms for sequence analysis are
known in the art, such as BLAST, described in Altschul et al., J. Mol. Biol. (1990)
215:403-10.

In general, variants of the invention have a sequence identity greater than 
at least about 65%, preferably at least about 75%, more preferably at least about 85%, 
and can be greater than at least about 90%, 91%, 92%, 93%, 94%, 95%, or 96%, most
preferably 97%, 98% or 99%. For the purposes of this invention, a preferred method of
calculating percent identity is the Smith-Waterman algorithm, applied as follows.
Global DNA sequence identity must be greater than 65% as determined by the
Smith-Waterman homology search algorithm as implemented in MPSRCH program (Oxford
Molecular) using an affine gap search with the following search parameters: gap open
penalty, 12; and gap extension penalty, 1.
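
For readers unfamiliar with affine gap scoring, the following Python sketch computes a Smith-Waterman local alignment score using the Gotoh recurrences with a gap open penalty of 12 and a gap extension penalty of 1. It is not the MPSRCH implementation: the match and mismatch scores (+5 and -4) are assumptions, as is the convention that a gap of length k costs gap_open + (k - 1) * gap_extend. The sketch returns only the score; deriving a percent identity would additionally require a traceback, which is omitted here.

```python
def smith_waterman_affine(a: str, b: str, match: int = 5, mismatch: int = -4,
                          gap_open: int = 12, gap_extend: int = 1) -> int:
    """Best local alignment score of a and b with affine gap penalties.

    match/mismatch values are ASSUMED; the first gap position costs
    gap_open and each further position costs gap_extend.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols        # H row i-1: best score ending at each cell
    f_prev = [neg_inf] * cols  # F row i-1: alignments ending in a vertical gap
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # E: alignments ending in a horizontal gap
        for j in range(1, cols):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Hypothetical sequences: the second carries a 3-nt insertion (GGG).
score_identical = smith_waterman_affine("ACGTACGT", "ACGTACGT")
score_gapped = smith_waterman_affine("AAAAACCCCC", "AAAAAGGGCCCCC")
```

In the second example, ten matches (50) less one gap of length three (12 + 1 + 1 = 14) gives 36, which exceeds the best ungapped local alignment (25).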

The subject nucleic acids can be cDNAs or genomic DNAs, as well as 
fragments thereof, particularly fragments that encode a biologically active gene product 
and/or are useful in the methods disclosed herein (e.g., in diagnosis, as a unique
identifier of a differentially expressed gene of interest, etc.). The term "cDNA" as used
herein is intended to include all nucleic acids that share the arrangement of sequence
elements found in native mature mRNA species, where sequence elements are exons
and 3' and 5' non-coding regions. Normally mRNA species have contiguous exons, 
with the intervening introns, when present, being removed by nuclear RNA splicing, to 
create a continuous open reading frame encoding a polypeptide of the invention.

A genomic sequence of interest comprises the nucleic acid present
between the initiation codon and the stop codon, as defined in the listed sequences,
including all of the introns that are normally present in a native chromosome. It can 
further include the 3' and 5' untranslated regions found in the mature mRNA. It can 
further include specific transcriptional and translational regulatory sequences, such as
promoters, enhancers, etc., including about 1 kb, but possibly more, of flanking
genomic DNA at either the 5' or 3' end of the transcribed region. The genomic DNA
can be isolated as a fragment of 100 kbp or smaller, substantially free of flanking
chromosomal sequence. The genomic DNA flanking the coding region, either 3' or
5', or internal regulatory sequences as sometimes found in introns, contains sequences
required for proper tissue, stage-specific, or disease-state specific expression.

The nucleic acid compositions of the subject invention can encode all or 
a part of the subject polypeptides. Double or single stranded fragments can be obtained 
from the DNA sequence by chemically synthesizing oligonucleotides in accordance 
with conventional methods, by restriction enzyme digestion, by PCR amplification, etc.
Isolated polynucleotides and polynucleotide fragments of the invention comprise at
least about 10, about 15, about 20, about 35, about 50, about 100, about 150 to about 
200, about 250 to about 300, or about 350 contiguous nt selected from the 
polynucleotide sequences as shown in SEQ ID NOs: 1-3351. The fragments also 
include those of lengths intermediate to the specifically mentioned lengths, such as 35,
36, 37, 38, 39, etc.; 150, 151, 152, 153, 154, etc. For the most part, fragments will be of
at least 15 nt, usually at least 18 nt or 25 nt, and up to at least about 50 contiguous nt in 
length or more. In a preferred embodiment, the polynucleotide molecules comprise a 
contiguous sequence of at least 12 nt selected from the group consisting of the 
polynucleotides shown in SEQ ID NOs: 1-3351. 

Probes specific to the polynucleotides of the invention can be generated
using the polynucleotide sequences disclosed in SEQ ID NOs: 1-3351. The probes are
preferably at least about a 12, 15, 16, 18, 20, 22, 24, or 25 nt fragment of a 
corresponding contiguous sequence of SEQ ID NOs: 1-3351, and can be less than 2, 1, 
0.5, 0.1, or 0.05 kb in length. The probes can be synthesized chemically or can be
generated from longer polynucleotides using restriction enzymes. The probes can be
labeled, for example, with a radioactive, biotinylated, or fluorescent tag. Preferably,
probes are designed based upon an identifying sequence of a polynucleotide of one of 
SEQ ID NOs: 1-3351. More preferably, probes are designed based on a contiguous 
sequence of one of the subject polynucleotides that remains unmasked following
application of a masking program for masking low complexity (e.g., XBLAST) to the
sequence, i.e., one would select an unmasked region, as indicated by the
polynucleotides outside the poly-n stretches of the masked sequence produced by the 
masking program. 
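
Selecting unmasked regions can be sketched as follows. This Python example assumes that the masking program replaces low-complexity stretches with runs of the letter N and that candidate probe regions must be at least 12 nt long; the function name `unmasked_regions` and the demo sequence are hypothetical.

```python
import re


def unmasked_regions(masked: str, min_len: int = 12) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each run of A/C/G/T at least
    min_len long; start is 0-based and end is exclusive."""
    return [
        (m.start(), m.end(), m.group())
        for m in re.finditer(r"[ACGT]+", masked.upper())
        if len(m.group()) >= min_len
    ]


# Hypothetical masked sequence: two usable regions and one short run.
masked_demo = "ACGTACGTACGTAC" + "N" * 8 + "ACGT" + "N" * 4 + "GATTACAGATTACA"
regions = unmasked_regions(masked_demo)
```

The 4-nt run between the two masked stretches is discarded because it is shorter than the assumed minimum probe length.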

The polynucleotides of the subject invention are isolated and obtained in
substantial purity, generally as other than an intact chromosome. Usually, the
polynucleotides, either as DNA or RNA, will be obtained substantially free of other
naturally-occurring nucleic acid sequences, generally being at least about 50%, usually
at least about 90% pure and are typically "recombinant", e.g., flanked by one or more
nucleotides with which they are not normally associated on a naturally occurring
chromosome.

The polynucleotides of the invention can be provided as a linear 
molecule or within a circular molecule, and can be provided within autonomously 
replicating molecules (vectors) or within molecules without replication sequences. 
Expression of the polynucleotides can be regulated by their own or by other regulatory
sequences known in the art. The polynucleotides of the invention can be introduced
into suitable host cells using a variety of techniques available in the art, such as
transferrin polycation-mediated DNA transfer, transfection with naked or encapsulated
nucleic acids, liposome-mediated DNA transfer, intracellular transportation of DNA-coated
latex beads, protoplast fusion, viral infection, electroporation, gene gun, calcium
phosphate-mediated transfection, and the like.

The subject nucleic acid compositions can be used to, for example,
produce polypeptides, as probes for the detection of mRNA of the invention in
biological samples (e.g., extracts of human cells), to generate additional copies of the
polynucleotides, to generate ribozymes or antisense oligonucleotides, and as single
stranded DNA probes or as triple-strand forming oligonucleotides. The probes
described herein can be used to, for example, determine the presence or absence of the 
polynucleotide sequences as shown in SEQ ID NOs: 1-3351 or variants thereof in a 
sample. These and other uses are described in more detail below.

Use of Polynucleotides to Obtain Full-Length cDNA, Gene, and Promoter Region

Full-length cDNA molecules comprising the disclosed polynucleotides 
are obtained as follows. A polynucleotide having a sequence of one of SEQ ID NOs: 1-3351,
or a portion thereof comprising at least 12, 15, 18, or 20 nt, is used as a
hybridization probe to detect hybridizing members of a cDNA library using probe
design methods, cloning methods, and clone selection techniques such as those 
described in U.S. Patent No. 5,654,173. Libraries of cDNA are made from selected 
tissues, such as normal or tumor tissue, or from tissues of a mammal treated with, for 
example, a pharmaceutical agent. Preferably, the tissue is the same as the tissue from
which the polynucleotides of the invention were isolated, as both the polynucleotides
described herein and the cDNA represent expressed genes. Most preferably, the cDNA 
library is made from the biological material described herein in the Examples. The 
choice of cell type for library construction can be made after the identity of the protein 
encoded by the gene corresponding to the polynucleotide of the invention is known.
This will indicate which tissue and cell types are likely to express the related gene, and
thus represent a suitable source for the mRNA for generating the cDNA. As described 
in the Examples, cDNA of the invention was isolated from specific cell or tissue types, 
and such cells and tissues are preferable for obtaining related nucleic acids. 

Techniques for producing and probing nucleic acid sequence libraries are
described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual
2nd Ed, (1989) Cold Spring Harbor Press, Cold Spring Harbor, NY. The cDNA can be 
prepared by using primers based on sequence from SEQ ID NOs: 1-3351. In one 
embodiment, the cDNA library can be made from only poly-adenylated mRNA. Thus, 
poly-T primers can be used to prepare cDNA from the mRNA. 

Members of the library that are larger than the provided polynucleotides,
and preferably that encompass the complete coding sequence of the native message, are
obtained. In order to confirm that the entire cDNA has been obtained, RNA protection 
experiments are performed as follows. Hybridization of a full-length cDNA to an 
mRNA will protect the RNA from RNase degradation. If the cDNA is not full length,
then the portions of the mRNA that are not hybridized will be subject to RNase
degradation. This is assayed, as is known in the art, by changes in electrophoretic 
mobility on polyacrylamide gels, or by detection of released monoribonucleotides. 
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., (1989) Cold 
Spring Harbor Press, Cold Spring Harbor, NY. In order to obtain additional sequences
5' to the end of a partial cDNA, 5' RACE (PCR Protocols: A Guide to Methods and
Applications, (1990) Academic Press, Inc.) can be performed.

Genomic DNA is isolated using the provided polynucleotides in a 
manner similar to the isolation of full-length cDNAs. Briefly, the provided 
polynucleotides, or portions thereof, are used as probes to libraries of genomic DNA.
Preferably, the library is obtained from the cell type that was used to generate the 
polynucleotides of the invention, but this is not essential. Most preferably, the genomic 
DNA is obtained from the biological material described herein in the Examples. Such 
libraries can be in vectors suitable for carrying large segments of a genome, such as P1
or YAC, as described in detail in Sambrook et al., 9.4-9.30. In addition, genomic
sequences can be isolated from human BAC libraries, which are commercially available
from Research Genetics, Inc., Huntsville, Alabama, USA, for example. In order to
obtain additional 5' or 3' sequences, chromosome walking is performed, as described in
Sambrook et al., such that adjacent and overlapping fragments of genomic DNA are
isolated. These are mapped and pieced together, as is known in the art, using restriction
digestion enzymes and DNA ligase.

Using the polynucleotide sequences of the invention, corresponding full- 
length genes can be isolated using both classical and PCR methods to construct and 
probe cDNA libraries. Using either method, Northern blots, preferably, are performed
on a number of cell types to determine which cell lines express the gene of interest at
the highest level. Classical methods of constructing cDNA libraries are taught in 
Sambrook et al., supra. With these methods, cDNA can be produced from mRNA and 
inserted into viral or expression vectors. Typically, libraries of mRNA comprising 
poly(A) tails can be produced with poly(T) primers. Similarly, cDNA libraries can be
produced using the instant sequences as primers.

PCR methods are used to amplify the members of a cDNA library that 
comprise the desired insert. In this case, the desired insert will contain sequence from 
the full length cDNA that corresponds to the instant polynucleotides. Such PCR 
methods include gene trapping and RACE methods as described in Gruber et al., WO
95/04745 and Gruber et al., U.S. Patent No. 5,500,356. Kits are commercially available
to perform gene trapping experiments from, for example, Life Technologies,
Gaithersburg, Maryland, USA. In preferred embodiments of RACE, a common primer
is designed to anneal to an arbitrary adaptor sequence ligated to cDNA ends (Apte and
Siebert, Biotechniques (1993) 15:890-893; Edwards et al., Nuc. Acids Res. (1991)
19:5227-5232). When a single gene-specific RACE primer is paired with the common
primer, preferential amplification of sequences between the single gene specific primer
and the common primer occurs. Commercial cDNA pools modified for use in RACE 
are available. 

The promoter region of a gene generally is located 5' to the initiation site 
for RNA polymerase II. Hundreds of promoter regions contain the "TATA" box, a
sequence such as TATTA or TATAA, which is sensitive to mutations. The promoter 
region can be obtained by performing 5' RACE using a primer from the coding region 
of the gene. Alternatively, the cDNA can be used as a probe for the genomic sequence, 
and the region 5' to the coding region is identified by "walking up." If the gene is 
highly expressed or differentially expressed, the promoter from the gene can be of use
in a regulatory construct for a heterologous gene. 
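
A candidate promoter region recovered in this way could be scanned for TATA-like motifs. The short Python sketch below looks only for the two example sequences named above (TATAA and TATTA) and reports their 0-based positions. The function name `find_tata_like` and the demo sequence are hypothetical, and a simple pattern match of this kind does not by itself establish promoter function.

```python
import re


def find_tata_like(seq: str) -> list[tuple[int, str]]:
    """Return (0-based position, motif) for each TATAA or TATTA,
    including overlapping occurrences."""
    return [
        (m.start(), m.group(1))
        for m in re.finditer(r"(?=(TAT[AT]A))", seq.upper())
    ]


# Hypothetical upstream sequence for illustration only.
upstream_demo = "GGCTATAAAAGGCTATTAGG"
hits = find_tata_like(upstream_demo)
```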

Once the full-length cDNA or gene is obtained, DNA encoding variants 
can be prepared by site-directed mutagenesis, described in detail in Sambrook et al., 
15.3-15.63. The choice of codon or nucleotide to be replaced can be based on disclosure 
herein on optional changes in amino acids to achieve altered protein structure and/or
function. 

As an alternative method to obtaining DNA or RNA from a biological 
material, nucleic acid comprising nucleotides having the sequence of one or more 
polynucleotides of the invention can be synthesized. Thus, the invention encompasses
nucleic acid molecules ranging in length from 15 nt (corresponding to at least 15
contiguous nt of one of SEQ ID NOs: 1-3351) up to a maximum length suitable for one 
or more biological manipulations, including replication and expression, of the nucleic 
acid molecule. The invention includes but is not limited to (a) nucleic acid having the 
size of a full gene, and comprising at least one of SEQ ID NOs: 1-3351; (b) the nucleic
acid of (a) also comprising at least one additional polynucleotide or gene, operably
linked to permit expression of a fusion protein; (c) an expression vector comprising (a)
or (b); (d) a plasmid comprising (a) or (b); and (e) a recombinant viral particle
comprising (a) or (b). Once provided with the polynucleotides disclosed herein,
construction or preparation of (a) - (e) is well within the skill in the art.

The sequence of a nucleic acid comprising at least 15 contiguous nt of at
least any one of SEQ ID NOs: 1-3351, preferably the entire sequence of at least any one
of SEQ ID NOs: 1-3351, is not limited and can be any sequence of A, T, G, and/or C 
(for DNA) and A, U, G, and/or C (for RNA) or modified bases thereof, including 
inosine and pseudouridine. The choice of sequence will depend on the desired function
and can be dictated by coding regions desired, the intron-like regions desired, and the
regulatory regions desired. Where the entire sequence of any one of SEQ ID NOs: 1-3351
is within the nucleic acid, the nucleic acid obtained is referred to herein as a
polynucleotide comprising the sequence of any one of SEQ ID NOs: 1-3351.

Expression of Polypeptide Encoded by Full-Length cDNA or Full-Length Gene

The provided polynucleotides (e.g., a polynucleotide having a sequence
of one of SEQ ID NOs: 1-3351), the corresponding cDNA, or the full-length gene is
used to express a partial or complete gene product. Constructs of polynucleotides 
having sequences of SEQ ID NOs: 1-3351 can be generated synthetically. Alternatively, 
single-step assembly of a gene and entire plasmid from large numbers of
oligodeoxyribonucleotides is described by, e.g., Stemmer et al., Gene (Amsterdam)
(1995) 164(1):49-53. In this method, assembly PCR (the synthesis of long DNA
sequences from large numbers of oligodeoxyribonucleotides (oligos)) is described. The
method is derived from DNA shuffling (Stemmer, Nature (1994) 370:389-391), and
does not rely on DNA ligase, but instead relies on DNA polymerase to build
increasingly longer DNA fragments during the assembly process.
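
The oligo design step that precedes assembly PCR can be sketched as follows. This Python example splits a target sequence into overlapping oligos on alternating strands so that neighbouring oligos can anneal and be extended by a polymerase. The oligo length (40 nt) and overlap (20 nt) are illustrative assumptions and are not parameters taken from the cited reference; the function names and the demo target are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_assembly_oligos(target: str, oligo_len: int = 40,
                           overlap: int = 20) -> list[str]:
    """Tile target with oligos that overlap their neighbours by `overlap`
    nt; even-numbered oligos are sense, odd-numbered are antisense."""
    step = oligo_len - overlap
    oligos = []
    for n, start in enumerate(range(0, len(target) - overlap, step)):
        piece = target[start:start + oligo_len]
        oligos.append(piece if n % 2 == 0 else reverse_complement(piece))
    return oligos


# Hypothetical 100-nt target, not one of the disclosed SEQ ID NOs.
target = ("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG" * 3)[:100]
oligos = design_assembly_oligos(target)
```

Practical designs would also balance melting temperatures across the overlaps, which this sketch does not attempt.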

Appropriate polynucleotide constructs are purified using standard 
recombinant DNA techniques as described in, for example, Sambrook et al., Molecular 
Cloning: A Laboratory Manual, 2nd Ed., (1989) Cold Spring Harbor Press, Cold Spring 
Harbor, NY, and under current regulations described in United States Dept. of HHS,
National Institutes of Health (NIH) Guidelines for Recombinant DNA Research. The
gene product encoded by a polynucleotide of the invention is expressed in any 
expression system, including, for example, bacterial, yeast, insect, amphibian and 
mammalian systems. Vectors, host cells and methods for obtaining expression in same 
are well known in the art. Suitable vectors and host cells are described in U.S. Patent 

25 No. 5,654,173. 

Polynucleotide molecules comprising a polynucleotide sequence
provided herein are generally propagated by placing the molecule in a vector. Viral and
non-viral vectors are used, including plasmids. The choice of plasmid will depend on
the type of cell in which propagation is desired and the purpose of propagation. Certain
vectors are useful for amplifying and making large amounts of the desired DNA
sequence. Other vectors are suitable for expression in cells in culture. Still other
vectors are suitable for transfer and expression in cells in a whole animal or person. The
choice of appropriate vector is well within the skill of the art. Many such vectors are
available commercially. Methods for preparation of vectors comprising a desired
sequence are well known in the art.

The polynucleotides set forth in SEQ ID NOs:1-3351 or their
corresponding full-length polynucleotides are linked to regulatory sequences as
appropriate to obtain the desired expression properties. These can include promoters
(attached either at the 5' end of the sense strand or at the 3' end of the antisense strand),
enhancers, terminators, operators, repressors, and inducers. The promoters can be
regulated or constitutive. In some situations it may be desirable to use conditionally
active promoters, such as tissue-specific or developmental stage-specific promoters.
These are linked to the desired nucleotide sequence using the techniques described
above for linkage to vectors. Any techniques known in the art can be used.

When any appropriate host cells or organisms are used to replicate
and/or express the polynucleotides or nucleic acids of the invention, the resulting
replicated nucleic acid, RNA, expressed protein or polypeptide, is within the scope of
the invention as a product of the host cell or organism. The product is recovered by any
appropriate means known in the art.

Once the gene corresponding to a selected polynucleotide is identified,
its expression can be regulated in the cell to which the gene is native. For example, an
endogenous gene of a cell can be regulated by an exogenous regulatory sequence as
disclosed in U.S. Patent No. 5,641,670.

Identification of Functional and Structural Motifs of Novel Genes 

Translations of the nucleotide sequence of the provided polynucleotides,
cDNAs or full genes can be aligned with individual known sequences. Similarity with
individual sequences can be used to determine the activity of the polypeptides encoded
by the polynucleotides of the invention. Also, sequences exhibiting similarity with
more than one individual sequence can exhibit activities that are characteristic of either
or both individual sequences.

The full length sequences and fragments of the polynucleotide sequences
of the nearest neighbors can be used as probes and primers to identify and isolate the
full length sequence corresponding to provided polynucleotides. The nearest neighbors
can indicate a tissue or cell type to be used to construct a library for the full-length
sequences corresponding to the provided polynucleotides.

Typically, a selected polynucleotide is translated in all six frames to
determine the best alignment with the individual sequences. The sequences disclosed
herein in the Sequence Listing are in a 5' to 3' orientation, and translation in three
frames can be sufficient. These amino acid sequences are referred to, generally, as
query sequences, which will be aligned with the individual sequences. Databases with
individual sequences are described in "Computer Methods for Macromolecular
Sequence Analysis" Methods in Enzymology (1996) 266, Doolittle, ed., Academic Press,
Inc., a division of Harcourt Brace & Co., San Diego, California, USA. Databases
include GenBank, EMBL, and DNA Database of Japan (DDBJ).
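
The six-frame translation step can be sketched in Python as follows. This is a
minimal illustration only, assuming the standard genetic code and a DNA input
written with A, C, G, and T; the function names are hypothetical and do not come
from any package named in this document. Codons containing other characters are
reported as X.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq, frame):
    """Translate one strand starting at offset `frame` (0, 1, or 2)."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Unknown codons (e.g., containing N) become X; stop codons become *.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(seq):
    """Return translations for frames +1..+3 (as given) and -1..-3 (reverse strand)."""
    rc = reverse_complement(seq)
    frames = {f"+{f + 1}": translate(seq, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(rc, f) for f in range(3)})
    return frames
```

For a sequence already known to be in 5' to 3' sense orientation, only the
three "+" frames would need to be inspected.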

Query and individual sequences can be aligned using the methods and
computer programs described above, and include BLAST, available over the world
wide web at http://www.ncbi.nlm.nih.gov/BLAST. Another alignment algorithm is
Fasta, available in the Genetics Computing Group (GCG) package, Madison,
Wisconsin, USA, a wholly owned subsidiary of Oxford Molecular Group, Inc. Other
techniques for alignment are described in Doolittle, supra. Preferably, an alignment
program that permits gaps in the sequence is utilized to align the sequences. The
Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence
alignments. See Meth. Mol. Biol. (1997) 70:173-187. Also, the GAP program using the
Needleman and Wunsch alignment method can be utilized to align sequences. An
alternative search strategy uses MPSRCH software, which runs on a MASPAR computer.
MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel
computer. This approach improves the ability to identify distantly related matches,
and is especially tolerant of small gaps and nucleotide sequence errors. Amino acid
sequences encoded by the provided polynucleotides can be used to search both protein
and DNA databases.
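
To illustrate how a gap-permitting local alignment is scored, the following is a
minimal Python sketch of Smith-Waterman scoring. It is not the implementation
used by MPSRCH, BestFit, or any other program named here. The match, mismatch,
and gap values are arbitrary assumptions chosen for readability; production tools
generally use substitution matrices and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the scoring
    matrix, so memory use is proportional to len(b).
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                                # start a new local alignment
                previous[j - 1] + substitution,   # align a[i-1] with b[j-1]
                previous[j] + gap,                # gap in b
                current[j - 1] + gap,             # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best
```

With these assumed scores, two sequences that differ by a single inserted
residue still align across the insertion, because one gap costs less than
abandoning the matches on either side of it.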

High Similarity. In general, in alignment results considered to be of high
similarity, the percent of the alignment region length is typically at least about 55% of
the total length of the query sequence; more typically, at least about 58%; even more
typically, at least about 60% of the total residue length of the query sequence. Usually,
percent length of the alignment region can be as much as about 62%; more usually, as
much as about 64%; even more usually, as much as about 66%. Further, for high
similarity, the region of alignment typically exhibits at least about 75% sequence
identity; more typically, at least about 78%; even more typically, at least about 80%
sequence identity. Usually, percent sequence identity can be as much as about 82%; more
usually, as much as about 84%; even more usually, as much as about 86%.

The p value is used in conjunction with these methods. If high similarity
is found, the query sequence is considered to have high similarity with a profile
sequence when the p value is less than or equal to about 10^-2; more usually, less than or
equal to about 10^-3; even more usually, less than or equal to about 10^-4. More typically,
the p value is no more than about 10^-5; more typically, no more than about
10^-10; even more typically, no more than about 10^-15 for the query sequence
to be considered high similarity.
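
The high-similarity criteria can be combined into a single check, sketched below
in Python. The record type and function name are hypothetical. The default
cutoffs are the least stringent values stated in the text (about 55% alignment
length, about 75% identity, p value of about 10^-2); treating the three criteria
as a strict conjunction is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class AlignmentResult:
    query_length: int     # total residues in the query sequence
    aligned_length: int   # residues in the alignment region
    identical: int        # identical residues within the alignment region
    p_value: float        # p value reported by the search program


def is_high_similarity(result, min_coverage=0.55, min_identity=0.75, max_p=1e-2):
    """Apply the alignment-length, identity, and p-value cutoffs together."""
    if result.query_length <= 0 or result.aligned_length <= 0:
        return False
    coverage = result.aligned_length / result.query_length
    identity = result.identical / result.aligned_length
    return (
        coverage >= min_coverage
        and identity >= min_identity
        and result.p_value <= max_p
    )
```

Stricter tiers can be applied by passing the more stringent values from the
text, for example `min_coverage=0.60, min_identity=0.80, max_p=1e-4`.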

Similarity Determined by Sequence Identity Alone. Sequence identity
alone can be used to determine similarity of a query sequence to an individual sequence
and can indicate the activity of the sequence. Such an alignment, preferably, permits
gaps to align sequences. Typically, the query sequence is related to the profile sequence
if the sequence identity over the entire query sequence is at least about 15%; more
typically, at least about 20%; even more typically, at least about 25%; even more
typically, at least about 50%. Sequence identity alone as a measure of similarity is most
useful when the query sequence is usually at least 80 residues in length; more usually,
at least 90 residues; even more usually, at least 95 amino acid residues in length. More
typically, similarity can be concluded based on sequence identity alone when the query
sequence is preferably 100 residues in length; more preferably, 120 residues in length;
even more preferably, 150 amino acid residues in length.
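
Identity "over the entire query sequence" can be computed from a gapped pairwise
alignment as in the following Python sketch. The function name and the use of
"-" as the gap character are assumptions for illustration. The denominator is
the number of query residues, not the number of alignment columns.

```python
def identity_over_query(aligned_query, aligned_subject, gap="-"):
    """Fraction of query residues that are identical to the aligned subject residue."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    query_residues = sum(1 for q in aligned_query if q != gap)
    if query_residues == 0:
        raise ValueError("query contains no residues")
    identical = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q != gap and q == s
    )
    return identical / query_residues
```

A caller would then compare the returned fraction with the chosen cutoff (for
example 0.15 or 0.25) and separately confirm that the query meets the minimum
length, such as 80 residues.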

Alignments with Profile and Multiple Aligned Sequences. Translations
of the provided polynucleotides can be aligned with amino acid profiles that define
either protein families or common motifs. Also, translations of the provided
polynucleotides can be aligned to multiple sequence alignments (MSA) comprising the
polypeptide sequences of members of protein families or motifs. Similarity or identity
with profile sequences or MSAs can be used to determine the activity of the gene
products (e.g., polypeptides) encoded by the provided polynucleotides or corresponding
cDNA or genes. For example, sequences that show an identity or similarity with a
chemokine profile or MSA can exhibit chemokine activities.

Profiles can be designed manually by (1) creating an MSA, which is an
alignment of the amino acid sequence of members that belong to the family and (2)
constructing a statistical representation of the alignment. Such methods are described,
for example, in Birney et al., Nucl. Acids Res. (1996) 24(14):2730-2739. MSAs of some
protein families and motifs are publicly available. MSAs are described also in
Sonnhammer et al., Proteins (1997) 28:405-420. A brief description of MSAs is
reported in Pascarella et al., Prot. Eng. (1996) 9(3):249-251. Techniques for building
profiles from MSAs are described in Sonnhammer et al., supra; Birney et al., supra;
and "Computer Methods for Macromolecular Sequence Analysis," Methods in
Enzymology (1996) 266, Doolittle, ed., Academic Press, Inc., San Diego, California, USA.
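
As a simple illustration of step (2), the Python sketch below turns an MSA into
per-column residue frequencies. This is only one possible statistical
representation and is an assumption of this example; the cited references
describe more elaborate models. The function name and the "-" gap character are
hypothetical.

```python
from collections import Counter


def build_profile(msa, gap="-"):
    """Return one {residue: frequency} dict per alignment column.

    `msa` is a list of equal-length aligned sequences. Gap characters are
    ignored when computing frequencies; an all-gap column yields {}.
    """
    if not msa or len({len(row) for row in msa}) != 1:
        raise ValueError("MSA must be non-empty with rows of equal length")
    profile = []
    for column in zip(*msa):
        counts = Counter(residue for residue in column if residue != gap)
        total = sum(counts.values())
        profile.append(
            {residue: n / total for residue, n in counts.items()} if total else {}
        )
    return profile
```

A query residue can then be scored against each column by looking up its
frequency, with absent residues contributing zero.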

Similarity between a query sequence and a protein family or motif can be
determined by (a) comparing the query sequence against the profile and/or (b) aligning
the query sequence with the members of the family or motif. Typically, a program such
as Searchwise is used to compare the query sequence to the statistical representation of
the multiple alignment, also known as a profile (see Birney et al., supra). Other
techniques to compare the sequence and profile are described in Sonnhammer et al.,
supra and Doolittle, supra.

Next, methods described by Feng et al., J. Mol. Evol. (1987) 25:351 and
Higgins et al., CABIOS (1989) 5:151 can be used to align the query sequence with the
members of a family or motif, also known as an MSA. Sequence alignments can be
generated using any of a variety of software tools. Examples include PileUp, which
creates a multiple sequence alignment, and is described in Feng et al., J. Mol. Evol.
(1987) 25:351. Another method, GAP, uses the alignment method of Needleman et al.,
J. Mol. Biol. (1970) 48:443. GAP is best suited for global alignment of sequences. A
third method, BestFit, functions by inserting gaps to maximize the number of matches
using the local homology algorithm of Smith et al., Adv. Appl. Math. (1981) 2:482. In
general, the following factors are used to determine if a similarity between a query
sequence and a profile or MSA exists: (1) number of conserved residues found in the
query sequence, (2) percentage of conserved residues found in the query sequence, (3)
number of frameshifts, and (4) spacing between conserved residues.

Some alignment programs that both translate and align sequences can
make any number of frameshifts when translating the nucleotide sequence to produce
the best alignment. The fewer frameshifts needed to produce an alignment, the stronger
the similarity or identity between the query and profile or MSAs. For example, a weak
similarity resulting from no frameshifts can be a better indication of activity or structure
of a query sequence than a strong similarity resulting from two frameshifts. Preferably,
three or fewer frameshifts are found in an alignment; more preferably, two or fewer
frameshifts; even more preferably, one or fewer frameshifts; even more preferably, no
frameshifts are found in an alignment of query and profile or MSAs.

Conserved residues are those amino acids found at a particular position
in all or some of the family or motif members. Alternatively, a position is considered
conserved if only a certain class of amino acids is found in a particular position in all or
some of the family members. For example, the N-terminal position can contain a
positively charged amino acid, such as lysine, arginine, or histidine.
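
One way to make this definition concrete is to examine each column of an MSA
and ask whether a single amino acid, or a class of amino acids, accounts for a
chosen fraction of the members. The Python sketch below does this. The function
name is hypothetical, the cutoff is a caller-supplied parameter, and only the
positively charged class (lysine, arginine, histidine) is taken from the text;
the negatively charged class is an added assumption for illustration.

```python
from collections import Counter

RESIDUE_CLASSES = {
    "positive": set("KRH"),  # class named in the text
    "negative": set("DE"),   # assumption: added only as a second example class
}


def conserved_positions(msa, min_fraction=0.4):
    """Map column index -> conserved residue letter or class name.

    A column is reported when one residue, or failing that one class,
    occurs in at least `min_fraction` of all members (rows).
    """
    members = len(msa)
    conserved = {}
    for position, column in enumerate(zip(*msa)):
        counts = Counter(column)
        counts.pop("-", None)  # gaps never count toward conservation
        if not counts:
            continue
        residue, top = counts.most_common(1)[0]
        if top / members >= min_fraction:
            conserved[position] = residue
            continue
        for name, letters in RESIDUE_CLASSES.items():
            in_class = sum(n for r, n in counts.items() if r in letters)
            if in_class / members >= min_fraction:
                conserved[position] = name
                break
    return conserved
```

The fraction of these conserved positions that a query sequence matches can
then serve as one of the similarity factors.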



Typically, a residue of a polypeptide is conserved when a class of amino
acids or a single amino acid is found at a particular position in at least about 40% of all
class members; more typically, at least about 50%; even more typically, at least about
60% of the members. Usually, a residue is conserved when a class or single amino acid
is found in at least about 70% of the members of a family or motif; more usually, at
least about 80%; even more usually, at least about 90%; even more usually, at least
about 95%.

A residue is considered conserved when three unrelated amino acids are
found at a particular position in some or all of the members; more usually, two
unrelated amino acids. These residues are conserved when the unrelated amino acids
are found at particular positions in at least about 40% of all class members; more
typically, at least about 50%; even more typically, at least about 60% of the members.
Usually, a residue is conserved when a class or single amino acid is found in at least
about 70% of the members of a family or motif; more usually, at least about 80%; even
more usually, at least about 90%; even more usually, at least about 95%.

A query sequence has similarity to a profile or MSA when the query
sequence comprises at least about 25% of the conserved residues of the profile or MSA;
more usually, at least about 30%; even more usually, at least about 40%. Typically, the
query sequence has a stronger similarity to a profile sequence or MSA when the query
sequence comprises at least about 45% of the conserved residues of the profile or MSA;
more typically, at least about 50%; even more typically, at least about 55%.

Identification of Secreted and Membrane-Bound Polypeptides

Both secreted and membrane-bound polypeptides of the present
invention are of particular interest. For example, levels of secreted polypeptides can be
assayed in body fluids that are convenient, such as blood, plasma, serum, and other
body fluids such as urine, prostatic fluid and semen. Membrane-bound polypeptides are
useful for constructing vaccine antigens or inducing an immune response. Such
antigens would comprise all or part of the extracellular region of the membrane-bound
polypeptides. Because both secreted and membrane-bound polypeptides comprise a
fragment of contiguous hydrophobic amino acids, hydrophobicity predicting algorithms
can be used to identify such polypeptides.

A signal sequence is usually encoded by both secreted and membrane-bound
polypeptide genes to direct a polypeptide to the surface of the cell. The signal
sequence usually comprises a stretch of hydrophobic residues. Such signal sequences
can fold into helical structures. Membrane-bound polypeptides typically comprise at
least one transmembrane region that possesses a stretch of hydrophobic amino acids that
can traverse the membrane. Some transmembrane regions also exhibit a helical
structure. Hydrophobic fragments within a polypeptide can be identified by using
computer algorithms. Such algorithms include Hopp & Woods, Proc. Natl. Acad. Sci.
USA (1981) 78:3824-3828; Kyte & Doolittle, J. Mol. Biol. (1982) 157:105-132; and
RAOAR algorithm, Degli Esposti et al., Eur. J. Biochem. (1990) 190:207-219.
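
A hydropathy profile of the Kyte & Doolittle type can be sketched in Python as
a sliding-window average over per-residue hydropathy values. The function name
is hypothetical and the default window length of 19 residues is an assumption of
this example, not a value given in this document. Only the 20 standard amino
acid letters are handled; any other character raises a KeyError.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of every full window along the protein.

    Returns an empty list when the protein is shorter than the window.
    """
    protein = protein.upper()
    if window < 1 or len(protein) < window:
        return []
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

Windows with a high mean value mark candidate hydrophobic segments, such as
signal sequences or transmembrane regions, for closer inspection.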

Another method of identifying secreted and membrane-bound
polypeptides is to translate the polynucleotides of the invention in all six frames and
determine if at least 8 contiguous hydrophobic amino acids are present. Those
translated polypeptides with at least 8; more typically, 10; even more typically, 12
contiguous hydrophobic amino acids are considered to be either a putative secreted or
membrane-bound polypeptide. Hydrophobic amino acids include alanine, glycine,
histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, threonine,
tryptophan, tyrosine, and valine.
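
The contiguous-stretch test can be sketched in Python with a regular expression.
The residue set below is exactly the list given in the text, written in
one-letter codes; the function name is hypothetical, and the input is assumed to
be a translated polypeptide in one-letter code.

```python
import re

# One-letter codes for the hydrophobic amino acids as listed in the text:
# Ala, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Thr, Trp, Tyr, Val.
HYDROPHOBIC = "AGHILKMFPTWYV"


def hydrophobic_stretches(protein, min_length=8):
    """Return (start_index, stretch) for each run of hydrophobic residues
    that is at least `min_length` long."""
    pattern = re.compile(f"[{HYDROPHOBIC}]{{{min_length},}}")
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]
```

Raising `min_length` to 10 or 12 applies the more stringent criteria.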

Identification of the Function of an Expression Product of a Full-Length Gene 
Ribozymes, antisense constructs, and dominant negative mutants can be
used to determine function of the expression product of a gene corresponding to a
polynucleotide provided herein. The phosphoramidite method of oligonucleotide
synthesis can be used to construct antisense molecules and ribozymes. See Beaucage et
al., Tet. Lett. (1981) 22:1859 and U.S. Patent No. 4,668,777. Automated devices for
synthesis are available to create oligonucleotides using this chemistry. Examples of
such devices include Biosearch 8600, Models 392 and 394 by Applied Biosystems, a
division of Perkin-Elmer Corp., Foster City, California, USA; and Expedite by
Perceptive Biosystems, Framingham, Massachusetts, USA. Synthetic RNA, phosphate
analog oligonucleotides, and chemically derivatized oligonucleotides can also be
produced, and can be covalently attached to other molecules. RNA oligonucleotides
can be synthesized, for example, using RNA phosphoramidites. This method can be
performed on an automated synthesizer, such as Applied Biosystems, Models 392 and
394, Foster City, California, USA.
Oligonucleotides of up to 200 nt can be synthesized; more typically, 100
nt; more typically, 50 nt; even more typically, 30 to 40 nt. These synthetic fragments can
be annealed and ligated together to construct larger fragments. See, for example,
Sambrook et al., supra. Trans-cleaving catalytic RNAs (ribozymes) are RNA
molecules possessing endoribonuclease activity. Ribozymes are specifically designed
for a particular target, and the target message must contain a specific nucleotide
sequence. They are engineered to cleave any RNA species site-specifically in the
background of cellular RNA. The cleavage event renders the mRNA unstable and
prevents protein expression. Importantly, ribozymes can be used to inhibit expression
of a gene of unknown function for the purpose of determining its function in an in vitro
or in vivo context, by detecting the phenotypic effect.

Antisense nucleic acids are designed to specifically bind to RNA,
resulting in the formation of RNA-DNA or RNA-RNA hybrids, with an arrest of DNA
replication, reverse transcription or messenger RNA translation. Antisense
polynucleotides based on a selected polynucleotide sequence can interfere with
expression of the corresponding gene. Antisense polynucleotides are typically
generated within the cell by expression from antisense constructs that contain the
antisense strand as the transcribed strand. Antisense polynucleotides based on the
disclosed polynucleotides will bind and/or interfere with the translation of mRNA
comprising a sequence complementary to the antisense polynucleotide. The expression
products of control cells and cells treated with the antisense construct are compared to
detect the protein product of the gene corresponding to the polynucleotide upon which
the antisense construct is based. The protein is isolated and identified using routine
biochemical methods.
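
Because an antisense polynucleotide is complementary to its target, the sequence
of a candidate antisense DNA oligonucleotide follows directly from the mRNA
sequence. The Python sketch below shows only that base-pairing relationship; the
function name and arguments are hypothetical, and the sketch makes no attempt to
assess target accessibility, specificity, or efficacy.

```python
# DNA base that pairs with each RNA base of the target.
PAIRS = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna, start, length):
    """Return the DNA oligo (written 5' to 3') complementary to
    mrna[start:start + length]. A DNA-style input (T for U) is accepted."""
    target = mrna.upper().replace("T", "U")[start:start + length]
    unknown = set(target) - set(PAIRS)
    if unknown:
        raise ValueError(f"unsupported bases in target: {sorted(unknown)}")
    # Reverse the target so the oligo reads 5' to 3', then complement it.
    return "".join(PAIRS[base] for base in reversed(target))
```

For example, a target window of 30 to 40 nt would yield an oligonucleotide in
the size range described above as typical for synthesis.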

Given the extensive background literature and clinical experience in
antisense therapy, one skilled in the art can use selected polynucleotides of the
invention as additional potential therapeutics. The choice of polynucleotide can be
narrowed by first testing them for binding to "hot spot" regions of the genome of
cancerous cells. If a polynucleotide is identified as binding to a "hot spot," testing the
polynucleotide as an antisense compound in the corresponding cancer cells is
warranted.

Dominant negative mutations also are readily generated for
corresponding proteins that are active as homomultimers. A mutant polypeptide will
interact with wild-type polypeptides (made from the other allele) and form a
non-functional multimer. Thus, a mutation is in a substrate-binding domain, a catalytic
domain, or a cellular localization domain. Preferably, the mutant polypeptide will be
overproduced. Point mutations are made that have such an effect. In addition, fusion of
different polypeptides of various lengths to the terminus of a protein can yield dominant
negative mutants. General strategies are available for making dominant negative
mutants (see, e.g., Herskowitz, Nature (1987) 329:219). Such techniques can be used to
create loss of function mutations, which are useful for determining protein function.

Polypeptides and Variants Thereof 

The polypeptides of the invention include those encoded by the disclosed
polynucleotides, as well as nucleic acids that, by virtue of the degeneracy of the genetic
code, are not identical in sequence to the disclosed polynucleotides. Thus, the invention
includes within its scope a polypeptide encoded by a polynucleotide having the
sequence of any one of SEQ ID NOs:1-3351 or a variant thereof.

In general, the term "polypeptide" as used herein refers to both the full
length polypeptide encoded by the recited polynucleotide, the polypeptide encoded by
the gene represented by the recited polynucleotide, as well as portions or fragments
thereof. "Polypeptides" also includes variants of the naturally occurring proteins, where
such variants are homologous or substantially similar to the naturally occurring protein,
and can be of an origin of the same or different species as the naturally occurring
protein (e.g., human, murine, or some other species that naturally expresses the recited
polypeptide, usually a mammalian species). In general, variant polypeptides have a
sequence that has at least about 80%, usually at least about 90%, and more usually at
least about 98% sequence identity with a differentially expressed polypeptide of the
invention, as measured by BLAST using the parameters described above. The variant
polypeptides can be naturally or non-naturally glycosylated, i.e., the polypeptide has a
glycosylation pattern that differs from the glycosylation pattern found in the
corresponding naturally occurring protein.

The invention also encompasses homologs of the disclosed polypeptides
(or fragments thereof) where the homologs are isolated from other species, i.e., other
animal or plant species, where such homologs are usually from mammalian species, e.g.,
rodents, such as mice, rats; domestic animals, e.g., horse, cow, dog, cat; and humans.
By "homolog" is meant a polypeptide having at least about 35%, usually at least about
40% and more usually at least about 60% amino acid sequence identity to a particular
differentially expressed protein as identified above, where sequence identity is
determined using the BLAST algorithm, with the parameters described above.

WO 01/02568 PCT/US00/18374

In general, the polypeptides of the subject invention are provided in a
non-naturally occurring environment, e.g., are separated from their naturally occurring
environment. In certain embodiments, the subject protein is present in a composition
that is enriched for the protein as compared to a control. As such, purified polypeptide
is provided, where by purified is meant that the protein is present in a composition that
is substantially free of non-differentially expressed polypeptides, where by substantially
free is meant that less than 90%, usually less than 60% and more usually less than 50%
of the composition is made up of non-differentially expressed polypeptides.



Also within the scope of the invention are variants; variants of
polypeptides include mutants, fragments, and fusions. Mutants can include amino acid
substitutions, additions or deletions. The amino acid substitutions can be conservative
amino acid substitutions or substitutions to eliminate non-essential amino acids, such as
to alter a glycosylation site, a phosphorylation site or an acetylation site, or to minimize
misfolding by substitution or deletion of one or more cysteine residues that are not
necessary for function. Conservative amino acid substitutions are those that preserve
the general charge, hydrophobicity/hydrophilicity, and/or steric bulk of the amino acid
substituted. Variants can be designed so as to retain biological activity of a particular
region of the protein (e.g., a functional domain and/or, where the polypeptide is a
member of a protein family, a region associated with a consensus sequence). Selection
of amino acid alterations for production of variants can be based upon the accessibility
(interior vs. exterior) of the amino acid (see, e.g., Go et al., Int. J. Peptide Protein Res.
(1980) 15:211), the thermostability of the variant polypeptide (see, e.g., Querol et al.,
Prot. Eng. (1996) 9:265), desired glycosylation sites (see, e.g., Olsen and Thomsen, J.
Gen. Microbiol. (1991) 137:579), desired disulfide bridges (see, e.g., Clarke et al.,
Biochemistry (1993) 32:4322; and Wakarchuk et al., Protein Eng. (1994) 7:1379),
desired metal binding sites (see, e.g., Toma et al., Biochemistry (1991) 30:91, and
Haezerbrouck et al., Protein Eng. (1993) 6:643), and desired substitutions within
proline loops (see, e.g., Masul et al., Appl. Env. Microbiol. (1994) 60:3579). Cysteine-
depleted muteins can be produced as disclosed in U.S. Patent No. 4,959,314.

Variants also include fragments of the polypeptides disclosed herein,
particularly biologically active fragments and/or fragments corresponding to functional
domains. Fragments of interest will typically be at least about 10 aa to at least about 15
aa in length, usually at least about 50 aa in length, and can be as long as 300 aa in length
or longer, but will usually not exceed about 1000 aa in length, where the fragment will
have a stretch of amino acids that is identical to a polypeptide encoded by a
polynucleotide having a sequence of any of SEQ ID NOs:1-3351, or a homolog thereof.
The protein variants described herein are encoded by polynucleotides that are within the
scope of the invention. The genetic code can be used to select the appropriate codons to
construct the corresponding variants.

Computer-Related Embodiments

In general, a library of polynucleotides is a collection of sequence
information, which information is provided in either biochemical form (e.g., as a
collection of polynucleotide molecules), or in electronic form (e.g., as a collection of
polynucleotide sequences stored in a computer-readable form, as in a computer system
and/or as part of a computer program). The sequence information of the
polynucleotides can be used in a variety of ways, e.g., as a resource for gene discovery,
as a representation of sequences expressed in a selected cell type (e.g., cell type
markers), and/or as markers of a given disease or disease state. In general, a disease
marker is a representation of a gene product that is present in all cells affected by
disease either at an increased or decreased level relative to a normal cell (e.g., a cell of
the same or similar type that is not substantially affected by disease). For example, a
polynucleotide sequence in a library can be a polynucleotide that represents an mRNA,
polypeptide, or other gene product encoded by the polynucleotide, that is either
overexpressed or underexpressed in a breast ductal cell affected by cancer relative to a
normal (i.e., substantially disease-free) breast cell.

The nucleotide sequence information of the library can be embodied in
any suitable form, e.g., electronic or biochemical forms. For example, a library of
sequence information embodied in electronic form comprises an accessible computer
data file (or, in biochemical form, a collection of nucleic acid molecules) that contains
the representative nucleotide sequences of genes that are differentially expressed (e.g.,
overexpressed or underexpressed) as between, for example, i) a cancerous cell and a
normal cell; ii) a cancerous cell and a dysplastic cell; iii) a cancerous cell and a cell
affected by a disease or condition other than cancer; iv) a metastatic cancerous cell and
a normal cell and/or non-metastatic cancerous cell; v) a malignant cancerous cell and a
non-malignant cancerous cell (or a normal cell) and/or vi) a dysplastic cell relative to a
normal cell. Other combinations and comparisons of cells affected by various diseases
or stages of disease will be readily apparent to the ordinarily skilled artisan.
Biochemical embodiments of the library include a collection of nucleic acids that have
the sequences of the genes in the library, where the nucleic acids can correspond to the
entire gene in the library or to a fragment thereof, as described in greater detail below.



The polynucleotide libraries of the subject invention generally comprise
sequence information of a plurality of polynucleotide sequences, where at least one of
the polynucleotides has a sequence of any of SEQ ID NOs:1-3351. By plurality is
meant at least 2, usually at least 3 and can include up to all of SEQ ID NOs:1-3351.
The length and number of polynucleotides in the library will vary with the nature of the
library, e.g., if the library is an oligonucleotide array, a cDNA array, a computer
database of the sequence information, etc.

Where the library is an electronic library, the nucleic acid sequence
information can be present in a variety of media. "Media" refers to a manufacture,
other than an isolated nucleic acid molecule, that contains the sequence information of
the present invention. Such a manufacture provides the genome sequence or a subset
thereof in a form that can be examined by means not directly applicable to the sequence
as it exists in a nucleic acid. For example, the nucleotide sequence of the present
invention, e.g., the nucleic acid sequences of any of the polynucleotides of SEQ ID
NOs:1-3351, can be recorded on computer readable media, e.g., any medium that can be
read and accessed directly by a computer. Such media include, but are not limited to:
magnetic storage media, such as a floppy disc, a hard disc storage medium, and a
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories such as magnetic/optical storage
media. One of skill in the art can readily appreciate how any of the presently known
computer readable media can be used to create a manufacture comprising a recording
of the present sequence information. "Recorded" refers to a process for storing
information on computer readable medium, using any such methods as known in the art.
Any convenient data storage structure can be chosen, based on the means used to access
the stored information. A variety of data processor programs and formats can be used
for storage, e.g., word processing text file, database format, etc. In addition to the
sequence information, electronic versions of the libraries of the invention can be
provided in conjunction or connection with other computer-readable information and/or
other types of computer-readable files (e.g., searchable files, executable files, etc.,
including, but not limited to, for example, search program software, etc.).

By providing the nucleotide sequence in computer readable form, the
information can be accessed for a variety of purposes. Computer software to access
sequence information is publicly available. For example, the BLAST (Altschul et al.,
supra) and BLAZE (Brutlag et al., Comp. Chem. (1993) 17:203) search algorithms on a
Sybase system can be used to identify open reading frames (ORFs) within the genome
that contain homology to ORFs from other organisms.

As used herein, "a computer-based system" refers to the hardware
means, software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware of the computer-based
systems of the present invention comprises a central processing unit (CPU), input
means, output means, and data storage means. A skilled artisan can readily appreciate
that any one of the currently available computer-based systems is suitable for use in the
present invention. The data storage means can comprise any manufacture comprising a
recording of the present sequence information as described above, or a memory access
means that can access such a manufacture.

"Search means" refers to one or more programs implemented on the 
computer-based system, to compare a target sequence or target structural motif, or
expression levels of a polynucleotide in a sample, with the stored sequence information.
Search means can be used to identify fragments or regions of the genome that match a
particular target sequence or target motif. A variety of algorithms are publicly
known and commercially available, e.g., MacPattern (EMBL), BLASTN and BLASTX
(NCBI). A "target sequence" can be any polynucleotide or amino acid sequence of six
or more contiguous nucleotides or two or more amino acids, preferably from about 10
to 100 amino acids or from about 30 to 300 nt. A variety of comparing means can be
used to accomplish comparison of sequence information from a sample (e.g., to analyze
target sequences, target motifs, or relative expression levels) with the data storage
means. A skilled artisan can readily recognize that any one of the publicly available
homology search programs can be used as the search means for the computer-based
systems of the present invention to accomplish comparison of target sequences and
motifs. Computer programs to analyze expression levels in a sample and in controls are
also known in the art.
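
By way of illustration only, a very simple search means can be sketched in Python as an exact-match search of a target sequence against stored sequences on both strands. This sketch is not BLASTN, BLASTX, or MacPattern and performs no homology scoring; the function names and the example sequences are hypothetical, and the six-nucleotide minimum mirrors the target sequence length stated above.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target(target, library, min_len=6):
    """Return (record name, strand, 0-based start) for every exact match.

    `library` is a {name: sequence} mapping standing in for the stored
    sequence information. Palindromic targets are reported on both strands.
    """
    if len(target) < min_len:
        raise ValueError(f"target must be at least {min_len} nucleotides long")
    target = target.upper()
    hits = []
    for name, seq in library.items():
        seq = seq.upper()
        for strand, query in (("+", target), ("-", reverse_complement(target))):
            start = seq.find(query)
            while start != -1:
                hits.append((name, strand, start))
                # Continue one position later so overlapping matches are found.
                start = seq.find(query, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical stored sequences, for illustration only.
    stored = {"rec1": "TTTTACCGGATTTT", "rec2": "GGGGATCCGGTGGGG"}
    print(find_target("ACCGGAT", stored))
```

A homology search program would replace the exact-match step with an alignment and scoring step, but the comparison of a target against the data storage means follows the same outline.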

A "target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen
based on a three-dimensional configuration that is formed upon the folding of the target
motif, or on consensus sequences of regulatory or active sites. There are a variety of
target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are
not limited to, hairpin structures, promoter sequences and other expression elements
such as binding sites for transcription factors.
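
A consensus-sequence target motif of the kind described above can be searched for with a regular expression. The following Python sketch converts a motif written in the standard IUPAC nucleotide ambiguity codes into a regular expression and reports every match position. The motif "TATAWAW" and the example sequence are illustrative assumptions only, and the function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC-coded consensus motif into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(motif, seq):
    """Return the 0-based start of every (possibly overlapping) motif match."""
    # A lookahead is used so that overlapping occurrences are all reported.
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [match.start() for match in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Illustrative motif and sequence; neither is taken from the Sequence Listing.
    print(find_motif("TATAWAW", "GGCTATAAATGGTATATATCC"))
```

Structural motifs such as hairpins depend on base pairing between separated positions and would need a different search strategy than this consensus match.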

A variety of structural formats for the input and output means can be
used to input and output the information in the computer-based systems of the present
invention. One format for an output means ranks the relative expression levels of
different polynucleotides. Such presentation provides a skilled artisan with a ranking of
relative expression levels to determine a gene expression profile.
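
The ranked output format described above can be sketched in a few lines of Python. The probe names and expression values are hypothetical, and the function name is an assumption made for illustration.

```python
def rank_expression(levels):
    """Return (rank, name, level) tuples ordered from highest to lowest level.

    `levels` is a {polynucleotide name: relative expression level} mapping.
    """
    ordered = sorted(levels.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, level)
            for rank, (name, level) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    # Hypothetical relative expression levels, for illustration only.
    measured = {"probe_a": 2.0, "probe_b": 7.5, "probe_c": 0.4}
    for rank, name, level in rank_expression(measured):
        print(rank, name, level)
```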

As discussed above, the "library" of the invention also encompasses
biochemical libraries of the polynucleotides of SEQ ID NOs: 1-3351, e.g., collections of
nucleic acids representing the provided polynucleotides. The biochemical libraries can
take a variety of forms, e.g., a solution of cDNAs, a pattern of probe nucleic acids stably
associated with a surface of a solid support (i.e., an array) and the like. Of particular
interest are nucleic acid arrays in which one or more of SEQ ID NOs: 1-3351 is
represented on the array. By array is meant an article of manufacture that has at least a
substrate with at least two distinct nucleic acid targets on one of its surfaces, where the
number of distinct nucleic acids can be considerably higher, typically being at least 10
nt, usually at least 20 nt and often at least 25 nt. A variety of different array formats
have been developed and are known to those of skill in the art. The arrays of the subject
invention find use in a variety of applications, including gene expression analysis, drug
screening, mutation analysis and the like, as disclosed in the above-listed exemplary
patent documents.

In addition to the above nucleic acid libraries, analogous libraries of
polypeptides are also provided, where the polypeptides of the library will
represent at least a portion of the polypeptides encoded by SEQ ID NOs: 1-3351.



Use of Polynucleotide Probes in Mapping, and in Tissue Profiling

Polynucleotide probes, generally comprising at least 12 contiguous nt of
a polynucleotide as shown in the Sequence Listing, are used for a variety of purposes,
such as chromosome mapping of the polynucleotide and detection of transcription
levels. Additional disclosure about preferred regions of the disclosed polynucleotide
sequences is found in the Examples. A probe that hybridizes specifically to a
polynucleotide disclosed herein should provide a detection signal at least 5-, 10-, or
20-fold higher than the background hybridization provided with other unrelated sequences.
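
The signal-over-background criterion above reduces to a simple ratio test, sketched here in Python. The default threshold of 5 reflects the lowest of the stated 5-, 10-, or 20-fold levels; the function names and the numeric signal values are hypothetical.

```python
def fold_over_background(signal, background):
    """Return the ratio of probe detection signal to background hybridization."""
    if background <= 0:
        raise ValueError("background signal must be positive")
    return signal / background


def is_specific(signal, background, min_fold=5.0):
    """Return True when the signal is at least `min_fold` times the background."""
    return fold_over_background(signal, background) >= min_fold


if __name__ == "__main__":
    # Hypothetical detection signals in arbitrary units.
    print(is_specific(1000.0, 150.0))               # tested against 5-fold
    print(is_specific(1000.0, 150.0, min_fold=10))  # tested against 10-fold
```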

Detection of Expression Levels. Nucleotide probes are used to detect
expression of a gene corresponding to the provided polynucleotide. In Northern blots,
mRNA is separated electrophoretically and contacted with a probe. A probe is detected
as hybridizing to an mRNA species of a particular size. The amount of hybridization is
quantitated to determine relative amounts of expression, for example under a particular
condition. Probes are used for in situ hybridization to cells to detect expression. Probes
can also be used in vivo for diagnostic detection of hybridizing sequences. Probes are
typically labeled with a radioactive isotope. Other types of detectable labels can be
used such as chromophores, fluors, and enzymes. Other examples of nucleotide
hybridization assays are described in WO92/02526 and U.S. Patent No. 5,124,246.

Alternatively, the Polymerase Chain Reaction (PCR) is another means
for detecting small amounts of target nucleic acids (see, e.g., Mullis et al., Meth.
Enzymol. (1987) 155:335; U.S. Patent No. 4,683,195; and U.S. Patent No. 4,683,202).
Two primer polynucleotides that hybridize with the target nucleic acids are
used to prime the reaction. The primers can be composed of sequence within or 3' and
5' to the polynucleotides of the Sequence Listing. Alternatively, if the primers are 3' and
5' to these polynucleotides, they need not hybridize to them or the complements. After
amplification of the target with a thermostable polymerase, the amplified target nucleic
acids can be detected by methods known in the art, e.g., Southern blot. mRNA or
cDNA can also be detected by traditional blotting techniques (e.g., Southern blot,
Northern blot, etc.) described in Sambrook et al., "Molecular Cloning: A Laboratory
Manual" (New York, Cold Spring Harbor Laboratory, 1989) (e.g., without PCR
amplification). In general, mRNA or cDNA generated from mRNA using a polymerase
enzyme can be purified and separated using gel electrophoresis, and transferred to a
solid support, such as nitrocellulose. The solid support is exposed to a labeled probe,
washed to remove any unhybridized probe, and duplexes containing the labeled probe
are detected.

Mapping. Polynucleotides of the present invention can be used to
identify a chromosome on which the corresponding gene resides. Such mapping can be
useful in identifying the function of the polynucleotide-related gene by its proximity to
other genes with known function. Function can also be assigned to the
polynucleotide-related gene when particular syndromes or diseases map to the same
chromosome. For example, use of polynucleotide probes in identification and
quantification of nucleic acid sequence aberrations is described in U.S. Patent No.
5,783,387. An exemplary mapping method is fluorescence in situ hybridization (FISH),
which facilitates comparative genomic hybridization to allow total genome assessment of
changes in relative copy number of DNA sequences (see, e.g., Valdes et al., Methods in
Molecular Biology (1997) 68:1). Polynucleotides can also be mapped to particular
chromosomes using, for example, radiation hybrids or chromosome-specific hybrid panels.
See Leach et al., Advances in Genetics, (1995) 33:63-99; Walter et al., Nature Genetics
(1994) 7:22; Walter and Goodfellow, Trends in Genetics (1992) 9:352. Panels for radiation
hybrid mapping are available from Research Genetics, Inc., Huntsville, Alabama, USA.
The statistical program RHMAP can be used to construct a map based on the data from
radiation hybridization with a measure of the relative likelihood of one order versus
another. RHMAP is available via the world wide web at
http://www.sph.umich.edu/group/statgen/software. In addition, commercial programs are
available for identifying regions of chromosomes commonly associated with disease, such
as cancer.

Tissue Typing or Profiling. Expression of specific mRNA
corresponding to the provided polynucleotides can vary in different cell types and can
be tissue-specific. This variation of mRNA levels in different cell types can be
exploited with nucleic acid probe assays to determine tissue types. For example, PCR,
branched DNA probe assays, or blotting techniques utilizing nucleic acid probes
substantially identical or complementary to polynucleotides listed in the Sequence
Listing can determine the presence or absence of the corresponding cDNA or mRNA.

Tissue typing can be used to identify the developmental organ or tissue
source of a metastatic lesion by identifying the expression of a particular marker of that
organ or tissue. If a polynucleotide is expressed only in a specific tissue type, and a
metastatic lesion is found to express that polynucleotide, then the developmental source
of the lesion has been identified. Expression of a particular polynucleotide can be
assayed by detection of either the corresponding mRNA or the protein product.

Use of Polymorphisms. A polynucleotide of the invention can be used in
forensics, genetic analysis, mapping, and diagnostic applications where the
corresponding region of a gene is polymorphic in the human population. Any means for
detecting a polymorphism in a gene can be used, including, but not limited to,
electrophoresis of protein polymorphic variants, differential sensitivity to restriction
enzyme cleavage, and hybridization to allele-specific probes.

Antibody Production

Expression products of a polynucleotide of the invention, as well as the
corresponding mRNA, cDNA, or complete gene, can be prepared and used for raising
antibodies for experimental, diagnostic, and therapeutic purposes. For polynucleotides
to which a corresponding gene has not been assigned, this provides an additional
method of identifying the corresponding gene. The polynucleotide or related cDNA is
expressed as described above, and antibodies are prepared. These antibodies are
specific to an epitope on the polypeptide encoded by the polynucleotide, and can
precipitate or bind to the corresponding native protein in a cell or tissue preparation or
in a cell-free extract of an in vitro expression system.

Methods for production of monoclonal and polyclonal antibodies that
specifically bind a selected antigen are well known in the art. The antibodies
specifically bind to epitopes present in the polypeptides encoded by polynucleotides
disclosed in the Sequence Listing. Typically, at least 6, 8, 10, or 12 contiguous amino
acids are required to form an epitope. Epitopes that involve non-contiguous amino
acids may require a longer polypeptide, e.g., at least 15, 25, or 50 amino acids.
Antibodies that specifically bind to human polypeptides encoded by the provided
polynucleotides should provide a detection signal at least 5-, 10-, or 20-fold higher than
a detection signal provided with other proteins when used in Western blots or other
immunochemical assays. Preferably, antibodies that specifically bind polypeptides of the
invention do not bind to other proteins in immunochemical assays at detectable levels
and can immunoprecipitate the specific polypeptide from solution.

The invention also contemplates naturally occurring antibodies specific
for a polypeptide of the invention. For example, serum antibodies to a polypeptide of
the invention in a human population can be purified by methods well known in the art,
e.g., by passing antiserum over a column to which the corresponding selected
polypeptide or fusion protein is bound. The bound antibodies can then be eluted from
the column, for example using a buffer with a high salt concentration.

In addition to the antibodies discussed above, the invention also
contemplates genetically engineered antibodies, antibody derivatives (e.g., single chain
antibodies, antibody fragments (e.g., Fab, etc.)), according to methods well known in
the art.

Other embodiments of the present invention include humanized
monoclonal antibodies capable of binding to the polypeptides of the invention. The
phrase "humanized antibody" refers to an antibody derived from a non-human antibody,
typically a mouse monoclonal antibody. Alternatively, a humanized antibody may be
derived from a chimeric antibody that retains or substantially retains the
antigen-binding properties of the parental, non-human, antibody but which exhibits
diminished immunogenicity as compared to the parental antibody when administered to
humans. The phrase "chimeric antibody," as used herein, refers to an antibody containing
sequence derived from two different antibodies (see, e.g., U.S. Patent No. 4,816,567)
which typically originate from different species. Most typically, chimeric antibodies
comprise human and murine antibody fragments, generally human constant and mouse
variable regions.

Because humanized antibodies are far less immunogenic in humans than
the parental mouse monoclonal antibodies, they can be used for the treatment of humans
with far less risk of anaphylaxis. Thus, these antibodies may be preferred in therapeutic
applications that involve in vivo administration to a human such as, e.g., use as radiation
sensitizers for the treatment of neoplastic disease or use in methods to reduce the side
effects of, e.g., cancer therapy.

Humanized antibodies may be achieved by a variety of methods
including, for example: (1) grafting the non-human complementarity determining
regions (CDRs) onto a human framework and constant region (a process referred to in
the art as "humanizing"), or, alternatively, (2) transplanting the entire non-human
variable domains, but "cloaking" them with a human-like surface by replacement of
surface residues (a process referred to in the art as "veneering"). In the present
invention, humanized antibodies will include both "humanized" and "veneered"
antibodies. These methods are disclosed in, e.g., Jones et al., Nature 321:522-525
(1986); Morrison et al., Proc. Natl. Acad. Sci. U.S.A., 81:6851-6855 (1984); Morrison
and Oi, Adv. Immunol., 44:65-92 (1988); Verhoeyen et al., Science 239:1534-1536
(1988); Padlan, Molec. Immun. 28:489-498 (1991); Padlan, Molec. Immunol.
31(3):169-217 (1994); and Kettleborough, C.A. et al., Protein Eng. 4(7):773-83 (1991),
each of which is incorporated herein by reference.

The phrase "complementarity determining region" refers to amino acid
sequences which together define the binding affinity and specificity of the natural Fv
region of a native immunoglobulin binding site. See, e.g., Chothia et al., J. Mol. Biol.
196:901-917 (1987); Kabat et al., U.S. Dept. of Health and Human Services NIH
Publication No. 91-3242 (1991). The phrase "constant region" refers to the portion of
the antibody molecule that confers effector functions. In the present invention, mouse
constant regions are substituted by human constant regions. The constant regions of the
subject humanized antibodies are derived from human immunoglobulins. The heavy
chain constant region can be selected from any of the five isotypes: alpha, delta,
epsilon, gamma or mu.

One method of humanizing antibodies comprises aligning the non-human
heavy and light chain sequences to human heavy and light chain sequences,
selecting and replacing the non-human framework with a human framework based on
such alignment, molecular modeling to predict the conformation of the humanized
sequence and comparing to the conformation of the parent antibody. This process is
followed by repeated back mutation of residues in the CDR region which disturb the
structure of the CDRs until the predicted conformation of the humanized sequence
model closely approximates the conformation of the non-human CDRs of the parent
non-human antibody. Such humanized antibodies may be further derivatized to
facilitate uptake and clearance, e.g., via Ashwell receptors. See, e.g., U.S. Patent Nos.
5,530,101 and 5,585,089, which patents are incorporated herein by reference.

Humanized antibodies can also be produced using transgenic animals
that are engineered to contain human immunoglobulin loci. For example, WO
98/24893 discloses transgenic animals having a human Ig locus wherein the animals do
not produce functional endogenous immunoglobulins due to the inactivation of
endogenous heavy and light chain loci. WO 91/10741 also discloses transgenic
non-primate mammalian hosts capable of mounting an immune response to an immunogen,
wherein the antibodies have primate constant and/or variable regions, and wherein the
endogenous immunoglobulin-encoding loci are substituted or inactivated. WO
96/30498 discloses the use of the Cre/Lox system to modify the immunoglobulin locus
in a mammal, such as to replace all or a portion of the constant or variable region to
form a modified antibody molecule. WO 94/02602 discloses non-human mammalian
hosts having inactivated endogenous Ig loci and functional human Ig loci. U.S. Patent
No. 5,939,598 discloses methods of making transgenic mice in which the mice lack
endogenous heavy chains, and express an exogenous immunoglobulin locus comprising
one or more xenogeneic constant regions.

Using a transgenic animal described above, an immune response can be
produced to a selected antigenic molecule, and antibody-producing cells can be
removed from the animal and used to produce hybridomas that secrete human
monoclonal antibodies. Immunization protocols, adjuvants, and the like are known in
the art, and are used in immunization of, for example, a transgenic mouse as described
in WO 96/33735. This publication discloses monoclonal antibodies against a variety of
antigenic molecules including IL-6, IL-8, TNF, human CD4, L-selectin, gp39, and
tetanus toxin. The monoclonal antibodies can be tested for the ability to inhibit or
neutralize the biological activity or physiological effect of the corresponding protein.
WO 96/33735 discloses that monoclonal antibodies against IL-8, derived from immune
cells of transgenic mice immunized with IL-8, blocked IL-8-induced functions of
neutrophils. Human monoclonal antibodies with specificity for the antigen used to
immunize transgenic animals are also disclosed in WO 96/34096.

Polynucleotides or Arrays for Diagnostics 

Polynucleotide arrays are created by spotting polynucleotide probes onto
a substrate (e.g., glass, nitrocellulose, etc.) in a two-dimensional matrix or array having
bound probes. The probes can be bound to the substrate by either covalent bonds or by
non-specific interactions, such as hydrophobic interactions. Samples of polynucleotides
can be detectably labeled (e.g., using radioactive or fluorescent labels) and then
hybridized to the probes. Double stranded polynucleotides, comprising the labeled
sample polynucleotides bound to probe polynucleotides, can be detected once the
unbound portion of the sample is washed away. Techniques for constructing arrays and
methods of using these arrays are described in EP 799 897; WO 97/29212; WO
97/27317; EP 785 280; WO 97/02357; U.S. Patent No. 5,593,839; U.S. Patent No.
5,578,832; EP 728 520; U.S. Patent No. 5,599,695; EP 721 016; U.S. Patent No.
5,556,752; WO 95/22058; and U.S. Patent No. 5,631,734. Arrays can be used to, for
example, examine differential expression of genes and can be used to determine gene
function. For example, arrays can be used to detect differential expression of a
polynucleotide between a test cell and control cell (e.g., cancer cells and normal cells).
For example, high expression of a particular message in a cancer cell, which is not
observed in a corresponding normal cell, can indicate a cancer specific gene product.
Exemplary uses of arrays are further described in, for example, Pappalarado et al., Sem.
Radiation Oncol. (1998) 8:217; and Ramsay, Nature Biotechnol. (1998) 16:40.

Differential Expression in Diagnosis 

The polynucleotides of the invention can also be used to detect
differences in expression levels between two cells, e.g., as a method to identify
abnormal or diseased tissue in a human. For polynucleotides corresponding to profiles
of protein families, the choice of tissue can be selected according to the putative
biological function. In general, the expression of a gene corresponding to a specific
polynucleotide is compared between a first tissue that is suspected of being diseased
and a second, normal tissue of the human. The tissue suspected of being abnormal or
diseased can be derived from a different tissue type of the human, but preferably it is
derived from the same tissue type; for example an intestinal polyp or other abnormal
growth should be compared with normal intestinal tissue. The normal tissue can be the
same tissue as that of the test sample, or any normal tissue of the patient, especially
those that express the polynucleotide-related gene of interest (e.g., brain, thymus, testis,
heart, prostate, placenta, spleen, small intestine, skeletal muscle, pancreas, and the
mucosal lining of the colon). A difference between the polynucleotide-related gene,
mRNA, or protein in the two tissues which are compared, for example in molecular
weight, amino acid or nucleotide sequence, or relative abundance, indicates a change in
the gene, or a gene which regulates it, in the tissue of the human that was suspected of
being diseased. Examples of detection of differential expression and its use in diagnosis
of cancer are described in U.S. Patent Nos. 5,688,641 and 5,677,125.

A genetic predisposition to disease in a human can also be detected by
comparing expression levels of an mRNA or protein corresponding to a polynucleotide
of the invention in a fetal tissue with levels associated with normal fetal tissue. Fetal
tissues that are used for this purpose include, but are not limited to, amniotic fluid,
chorionic villi, blood, and the blastomere of an in vitro-fertilized embryo. The
comparable normal polynucleotide-related gene is obtained from any tissue. The mRNA
or protein is obtained from a normal tissue of a human in which the
polynucleotide-related gene is expressed. Differences such as alterations in the
nucleotide sequence or size of the same product of the fetal polynucleotide-related gene
or mRNA, or alterations in the molecular weight, amino acid sequence, or relative
abundance of fetal protein, can indicate a germline mutation in the polynucleotide-related
gene of the fetus, which indicates a genetic predisposition to disease. In general,
diagnostic, prognostic, and other methods of the invention based on differential
expression involve detection of a level or amount of a gene product, particularly a
differentially expressed gene product, in a test sample obtained from a patient suspected
of having or being susceptible to a disease (e.g., breast cancer, lung cancer, colon cancer
and/or metastatic forms thereof), and comparing the detected levels to those levels found
in normal cells (e.g., cells substantially unaffected by cancer) and/or other control cells
(e.g., to differentiate a cancerous cell from a cell affected by dysplasia). Furthermore,
the severity of the disease can be assessed by comparing the detected levels of a
differentially expressed gene product with those levels detected in samples representing
the levels of differentially expressed gene product associated with varying degrees of
severity of disease. It should be noted that use of the term "diagnostic" herein is not
necessarily meant to exclude "prognostic" or "prognosis," but rather is used as a matter
of convenience.

The term "differentially expressed gene" is generally intended to
encompass a polynucleotide that can, for example, include an open reading frame
encoding a gene product (e.g., a polypeptide), and/or introns of such genes and adjacent
5' and 3' non-coding nucleotide sequences involved in the regulation of expression, up
to about 20 kb beyond the coding region, but possibly further in either direction. The
gene can be introduced into an appropriate vector for extrachromosomal maintenance or
for integration into a host genome. In general, a difference in expression level
associated with a decrease in expression level of at least about 25%, usually at least
about 50% to 75%, more usually at least about 90% or more is indicative of a
differentially expressed gene of interest, i.e., a gene that is underexpressed or
down-regulated in the test sample relative to a control sample. Furthermore, a difference
in expression level associated with an increase in expression of at least about 25%, usually
at least about 50% to 75%, more usually at least about 90% and can be at least about
1½-fold, usually at least about 2-fold to about 10-fold, and can be about 100-fold to
about 1,000-fold increase relative to a control sample is indicative of a differentially
expressed gene of interest, i.e., an overexpressed or up-regulated gene.
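
The thresholds above can be applied mechanically once expression levels in a test sample and a control sample have been measured. The following Python sketch classifies a gene using the lowest stated threshold (a change of at least about 25% relative to control); the function names, the category labels, and the example values are assumptions made for illustration, and a stricter threshold can be passed in.

```python
def fold_change(test_level, control_level):
    """Return the test-to-control expression ratio."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return test_level / control_level


def classify_expression(test_level, control_level, threshold=0.25):
    """Label a gene by its fractional change in expression relative to control.

    A fractional increase of at least `threshold` is labeled "overexpressed",
    a fractional decrease of at least `threshold` is labeled "underexpressed",
    and anything in between is labeled "unchanged".
    """
    change = fold_change(test_level, control_level) - 1.0
    if change >= threshold:
        return "overexpressed"
    if change <= -threshold:
        return "underexpressed"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical expression levels in arbitrary units.
    print(classify_expression(200.0, 100.0))  # 2-fold increase
    print(classify_expression(50.0, 100.0))   # 50% decrease
    print(classify_expression(110.0, 100.0))  # 10% increase
```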

"Differentially expressed polynucleotide" as used herein means a nucleic
acid molecule (RNA or DNA) comprising a sequence that represents a differentially
expressed gene, e.g., the differentially expressed polynucleotide comprises a sequence
(e.g., an open reading frame encoding a gene product) that uniquely identifies a
differentially expressed gene so that detection of the differentially expressed
polynucleotide in a sample is correlated with the presence of a differentially expressed
gene in a sample. "Differentially expressed polynucleotides" is also meant to
encompass fragments of the disclosed polynucleotides, e.g., fragments retaining
biological activity, as well as nucleic acids homologous, substantially similar, or
substantially identical (e.g., having about 90% sequence identity) to the disclosed
polynucleotides.



"Diagnosis" as used herein generally includes determination of a
subject's susceptibility to a disease or disorder, determination as to whether a subject is
presently affected by a disease or disorder, as well as to the prognosis of a subject
affected by a disease or disorder (e.g., identification of pre-metastatic or metastatic
cancerous states, stages of cancer, or responsiveness of cancer to therapy). The present
invention particularly encompasses diagnosis of subjects in the context of breast cancer
(e.g., carcinoma in situ (e.g., ductal carcinoma in situ), estrogen receptor (ER)-positive
breast cancer, ER-negative breast cancer, or other forms and/or stages of breast cancer),
lung cancer (e.g., small cell carcinoma, non-small cell carcinoma, mesothelioma, and
other forms and/or stages of lung cancer), and colon cancer (e.g., adenomatous polyp,
colorectal carcinoma, and other forms and/or stages of colon cancer).



"Sample" or "biological sample" as used throughout herein are generally
meant to refer to samples of biological fluids or tissues, particularly samples obtained
from tissues, especially from cells of the type associated with the disease for which the
diagnostic application is designed (e.g., ductal adenocarcinoma), and the like.
"Samples" is also meant to encompass derivatives and fractions of such samples (e.g.,
cell lysates). Where the sample is solid tissue, the cells of the tissue can be dissociated
or tissue sections can be analyzed.

Methods of the subject invention useful in diagnosis or prognosis
typically involve comparison of the abundance of a selected differentially expressed
gene product in a sample of interest with that of a control to determine any relative
differences in the expression of the gene product, where the difference can be measured
qualitatively and/or quantitatively. Quantitation can be accomplished, for example, by
comparing the level of expression product detected in the sample with the amounts of
product present in a standard curve. A comparison can be made visually; by using a
technique such as densitometry, with or without computerized assistance; by preparing
a representative library of cDNA clones of mRNA isolated from a test sample,
sequencing the clones in the library to determine the number of cDNA clones
corresponding to the same gene product, and analyzing the number of clones
corresponding to that same gene product relative to the number of clones of the same
gene product in a control sample; or by using an array to detect relative levels of
hybridization to a selected sequence or set of sequences, and comparing the
hybridization pattern to that of a control. The differences in expression are then
correlated with the presence or absence of an abnormal expression pattern. A variety of
different methods for determining the nucleic acid abundance in a sample are known to
those of skill in the art (see, e.g., WO 97/27317). In general, diagnostic assays of the
invention involve detection of a gene product of the polynucleotide sequence (e.g.,
mRNA or polypeptide) that corresponds to a sequence of SEQ ID NOs: 1-3351. The
patient from whom the sample is obtained can be apparently healthy, susceptible to
disease (e.g., as determined by family history or exposure to certain environmental
factors), or can already be identified as having a condition in which altered expression
of a gene product of the invention is implicated.

Diagnosis can be determined based on detected gene product expression
levels of a gene product encoded by at least one, preferably at least two or more, at least
3 or more, or at least 4 or more of the polynucleotides having a sequence set forth in
SEQ ID NOs: 1-3351, and can involve detection of expression of genes corresponding to
all of SEQ ID NOs: 1-3351 and/or additional sequences that can serve as additional
diagnostic markers and/or reference sequences. Where the diagnostic method is
designed to detect the presence or susceptibility of a patient to cancer, the assay
preferably involves detection of a gene product encoded by a gene corresponding to a
polynucleotide that is differentially expressed in cancer. Examples of such differentially
expressed polynucleotides are described in the Examples below. Given the provided
polynucleotides and information regarding their relative expression levels provided
herein, assays using such polynucleotides and detection of their expression levels in
diagnosis and prognosis will be readily apparent to the ordinarily skilled artisan.

Any of a variety of detectable labels can be used in connection with the
various embodiments of the diagnostic methods of the invention. Suitable detectable
labels include fluorochromes (e.g., fluorescein isothiocyanate (FITC), rhodamine,
Texas Red, phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM),
2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein, 6-carboxy-X-rhodamine (ROX),
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA)), radioactive labels (e.g., 32P,
35S, 3H, etc.), and the like. The detectable label can involve a two stage system (e.g.,
biotin-avidin, hapten-anti-hapten antibody, etc.).

Reagents specific for the polynucleotides and polypeptides of the 
invention, such as antibodies and nucleotide probes, can be supplied in a kit for 
detecting the presence of an expression product in a biological sample. The kit can also 
contain buffers or labeling components, as well as instructions for using the reagents to
detect and quantify expression products in the biological sample. Exemplary
embodiments of the diagnostic methods of the invention are described below in more 
detail. 

Polypeptide detection in diagnosis. In one embodiment, the test sample
is assayed for the level of a differentially expressed polypeptide. Diagnosis can be
accomplished using any of a number of methods to determine the absence or presence
or altered amounts of the differentially expressed polypeptide in the test sample. For
example, detection can utilize staining of cells or histological sections with labeled
antibodies, performed in accordance with conventional methods. Cells can be
permeabilized to stain cytoplasmic molecules. In general, antibodies that specifically
bind a differentially expressed polypeptide of the invention are added to a sample, and
incubated for a period of time sufficient to allow binding to the epitope, usually at least
about 10 minutes. The antibody can be detectably labeled for direct detection (e.g.,
using radioisotopes, enzymes, fluorescers, chemiluminescers, and the like), or can be
used in conjunction with a second stage antibody or reagent to detect binding (e.g.,
biotin with horseradish peroxidase-conjugated avidin, a secondary antibody conjugated
to a fluorescent compound, e.g., fluorescein, rhodamine, Texas red, etc.). The absence
or presence of antibody binding can be determined by various methods, including flow
cytometry of dissociated cells, microscopy, radiography, scintillation counting, etc.
Any suitable alternative method of qualitative or quantitative detection of levels or
amounts of differentially expressed polypeptide can be used, for example ELISA,
western blot, immunoprecipitation, radioimmunoassay, etc.

mRNA detection. The diagnostic methods of the invention can also or
alternatively involve detection of mRNA encoded by a gene corresponding to a
differentially expressed polynucleotide of the invention. Any suitable qualitative or
quantitative methods known in the art for detecting specific mRNAs can be used.
mRNA can be detected by, for example, in situ hybridization in tissue sections, by
reverse transcriptase-PCR, or in Northern blots containing poly A+ mRNA. One of
skill in the art can readily use these methods to determine differences in the size or
amount of mRNA transcripts between two samples. mRNA expression levels in a
sample can also be determined by generation of a library of expressed sequence tags
(ESTs) from the sample, where the EST library is representative of sequences present in
the sample (Adams, et al., (1991) Science 252:1651). Enumeration of the relative
representation of ESTs within the library can be used to approximate the relative
representation of the gene transcript within the starting sample. The results of EST
analysis of a test sample can then be compared to EST analysis of a reference sample to
determine the relative expression levels of a selected polynucleotide, particularly a
polynucleotide corresponding to one or more of the differentially expressed genes
described herein. Alternatively, gene expression analysis in a test sample can be
performed using serial analysis of gene expression (SAGE) methodology (e.g.,
Velculescu et al., Science (1995) 270:484) or differential display (DD) methodology
(see, e.g., U.S. Patent Nos. 5,776,683 and 5,807,680).
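By way of illustration only, the EST enumeration described above can be sketched in a few lines of Python. The sketch assumes that each EST has already been assigned to a gene identifier; the function names and the pseudocount of 0.5 are hypothetical choices made for this example and are not taken from the cited references.

```python
from collections import Counter


def est_frequencies(est_gene_ids):
    """Fraction of the ESTs in one library that were assigned to each gene."""
    counts = Counter(est_gene_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: count / total for gene, count in counts.items()}


def relative_expression(test_ests, reference_ests, gene, pseudocount=0.5):
    """Ratio of a gene's EST frequency in the test library to the reference library.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when the gene is absent from one of the two libraries.
    """
    test_count = Counter(test_ests)[gene]
    reference_count = Counter(reference_ests)[gene]
    test_freq = (test_count + pseudocount) / (len(test_ests) + pseudocount)
    reference_freq = (reference_count + pseudocount) / (len(reference_ests) + pseudocount)
    return test_freq / reference_freq
```

A ratio well above or below 1 would suggest that the transcript is over- or under-represented in the test sample relative to the reference sample; how large a departure is meaningful depends on library size and is not addressed by this sketch.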

Alternatively, gene expression can be analyzed using hybridization
analysis. Oligonucleotides or cDNA can be used to selectively identify or capture DNA
or RNA of specific sequence composition, and the amount of RNA or cDNA hybridized
to a known capture sequence determined qualitatively or quantitatively, to provide
information about the relative representation of a particular message within the pool of
cellular messages in a sample. Hybridization analysis can be designed to allow for
concurrent screening of the relative expression of hundreds to thousands of genes by
using, for example, array-based technologies having high density formats, including
filters, microscope slides, or microchips, or solution-based technologies that use
spectroscopic analysis (e.g., mass spectrometry). One exemplary use of arrays in the
diagnostic methods of the invention is described below in more detail.



Use of a single gene in diagnostic applications. The diagnostic methods
of the invention can focus on the expression of a single differentially expressed gene.
For example, the diagnostic method can involve detecting a differentially expressed
gene, or a polymorphism of such a gene (e.g., a polymorphism in a coding region or
control region), that is associated with disease. Disease-associated polymorphisms can
include deletion or truncation of the gene, mutations that alter expression level and/or
affect activity of the encoded protein, etc.

A number of methods are available for analyzing nucleic acids for the
presence of a specific sequence, e.g., a disease associated polymorphism. Where large
amounts of DNA are available, genomic DNA is used directly. Alternatively, the
region of interest is cloned into a suitable vector and grown in sufficient quantity for
analysis. Cells that express a differentially expressed gene can be used as a source of
mRNA, which can be assayed directly or reverse transcribed into cDNA for analysis.
The nucleic acid can be amplified by conventional techniques, such as the polymerase
chain reaction (PCR), to provide sufficient amounts for analysis, and a detectable label
can be included in the amplification reaction (e.g., using a detectably labeled primer or
detectably labeled oligonucleotides) to facilitate detection. Alternatively, various
methods are also known in the art that utilize oligonucleotide ligation as a means of
detecting polymorphisms, see e.g., Riley et al., Nucl. Acids Res. (1990) 18:2887; and
Delahunty et al., Am. J. Hum. Genet. (1996) 58:1239.

The amplified or cloned sample nucleic acid can be analyzed by one of a
number of methods known in the art. The nucleic acid can be sequenced by dideoxy or
other methods, and the sequence of bases compared to a selected sequence, e.g., to a
wild-type sequence. Hybridization with the polymorphic or variant sequence can also
be used to determine its presence in a sample (e.g., by Southern blot, dot blot, etc.). The
hybridization pattern of a polymorphic or variant sequence and a control sequence to an
array of oligonucleotide probes immobilized on a solid support, as described in U.S.
Patent No. 5,445,934, or in WO 95/35505, can also be used as a means of identifying
polymorphic or variant sequences associated with disease. Single strand
conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis
(DGGE), and heteroduplex analysis in gel matrices are used to detect conformational
changes created by DNA sequence variation as alterations in electrophoretic mobility.
Alternatively, where a polymorphism creates or destroys a recognition site for a
restriction endonuclease, the sample is digested with that endonuclease, and the
products size fractionated to determine whether the fragment was digested.
Fractionation is performed by gel or capillary electrophoresis, particularly acrylamide or
agarose gels.
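The restriction-site approach lends itself to a short in silico illustration. The following Python sketch is offered under simplifying assumptions: it considers a single linear fragment, searches one strand only (which suffices for a palindromic site such as the EcoRI site GAATTC used in the usage note), and represents the cut position as an offset from the start of the site. The function names are hypothetical.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Lengths of the fragments expected after complete digestion of a linear molecule.

    `cut_offset` is the number of bases from the start of the recognition
    site to the cut position on the searched strand.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def site_change(wild_type, variant, site):
    """Report whether the variant creates, destroys, or leaves unchanged the site."""
    in_wild_type = site in wild_type
    in_variant = site in variant
    if in_wild_type and not in_variant:
        return "destroyed"
    if in_variant and not in_wild_type:
        return "created"
    return "unchanged"
```

For example, with the hypothetical 12 nt sequence AAAGAATTCTTT and a variant AAAGAGTTCTTT, `fragment_lengths` gives two fragments for the wild-type sequence and one undigested fragment for the variant, which is the difference that size fractionation would be expected to reveal.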

Screening for mutations in a gene can be based on the functional or
antigenic characteristics of the protein. Protein truncation assays are useful in detecting
deletions that can affect the biological activity of the protein. Various immunoassays
designed to detect polymorphisms in proteins can be used in screening. Where many
diverse genetic mutations lead to a particular disease phenotype, functional protein
assays have proven to be effective screening tools. The activity of the encoded protein
can be determined by comparison with the wild-type protein.

Pattern matching in diagnosis using arrays. In another embodiment, the
diagnostic and/or prognostic methods of the invention involve detection of expression
of a selected set of genes in a test sample to produce a test expression pattern (TEP).
The TEP is compared to a reference expression pattern (REP), which is generated by
detection of expression of the selected set of genes in a reference sample (e.g., a
positive or negative control sample). The selected set of genes includes at least one of
the genes of the invention, which genes correspond to the polynucleotide sequences of
SEQ ID NOs: 1-3351. Of particular interest is a selected set of genes that includes genes
differentially expressed in the disease for which the test sample is to be screened.

"Reference sequences" or "reference polynucleotides" as used herein in 
the context of differential gene expression analysis and diagnosis/prognosis refer to a
selected set of polynucleotides, which selected set includes one or more of the
differentially expressed polynucleotides described herein. A plurality of reference
sequences, preferably comprising positive and negative control sequences, can be
included as reference sequences. Additional suitable reference sequences are found in
GenBank, UniGene, and other nucleotide sequence databases (including, e.g., expressed
sequence tag (EST), partial, and full-length sequences).

"Reference array" means an array having reference sequences for use in 
hybridization with a sample, where the reference sequences include all, at least one of,
or any subset of the differentially expressed polynucleotides described herein. Usually
such an array will include at least 3 different reference sequences, and can include any
one or all of the provided differentially expressed sequences. Arrays of interest can
further comprise sequences, including polymorphisms, of other genetic sequences,
particularly other sequences of interest for screening for a disease or disorder (e.g.,
cancer, dysplasia, or other related or unrelated diseases, disorders, or conditions). The
oligonucleotide sequence on the array will usually be at least about 12 nt in length, and
can be of about the length of the provided sequences, or can extend into the flanking
regions to generate fragments of 100 nt to 200 nt in length or more. Reference arrays
can be produced according to any suitable methods known in the art. For example,
methods of producing large arrays of oligonucleotides are described in U.S. Patent Nos.
5,134,854 and 5,445,934 using light-directed synthesis techniques. Using a computer
controlled system, a heterogeneous array of monomers is converted, through
simultaneous coupling at a number of reaction sites, into a heterogeneous array of
polymers. Alternatively, microarrays are generated by deposition of pre-synthesized
oligonucleotides onto a solid substrate, for example as described in PCT published
application no. WO 95/35505.
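As a purely illustrative aid, the selection of array oligonucleotides from a reference sequence can be sketched as a simple tiling step. The sketch below assumes fixed-length probes taken at regular intervals along one strand; the function name, the default probe length of 25 nt, and the step size are hypothetical, and only the lower bound of about 12 nt comes from the description above.

```python
def tile_probes(sequence, probe_length=25, step=1):
    """Return (start, probe) pairs of fixed-length probes tiled along a sequence."""
    if probe_length < 12:
        # The description gives about 12 nt as the usual minimum length.
        raise ValueError("probe_length should be at least about 12 nt")
    if step < 1:
        raise ValueError("step must be a positive integer")
    last_start = len(sequence) - probe_length
    return [
        (start, sequence[start:start + probe_length])
        for start in range(0, last_start + 1, step)
    ]
```

A sequence shorter than the requested probe length yields an empty list, which a caller could treat as a signal to extend into flanking sequence, as contemplated above.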

A "reference expression pattern" or "REP" as used herein refers to the
relative levels of expression of a selected set of genes, particularly of differentially
expressed genes, that are associated with a selected cell type, e.g., a normal cell, a
cancerous cell, a cell exposed to an environmental stimulus, and the like. A "test
expression pattern" or "TEP" refers to relative levels of expression of a selected set of
genes, particularly of differentially expressed genes, in a test sample (e.g., a cell of
unknown or suspected disease state, from which mRNA is isolated).

REPs can be generated in a variety of ways according to methods well
known in the art. For example, REPs can be generated by hybridizing a control sample
to an array having a selected set of polynucleotides (particularly a selected set of
differentially expressed polynucleotides), acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the REP with
a TEP. Alternatively, all expressed sequences in a control sample can be isolated and
sequenced, e.g., by isolating mRNA from a control sample, converting the mRNA into
cDNA, and sequencing the cDNA. The resulting sequence information roughly or
precisely reflects the identity and relative number of expressed sequences in the sample.
The sequence information can then be stored in a format (e.g., a computer-readable
format) that allows for ready comparison of the REP with a TEP. The REP can be
normalized prior to or after data storage, and/or can be processed to selectively remove
sequences of expressed genes that are of less interest or that might complicate analysis
(e.g., some or all of the sequences associated with housekeeping genes can be
eliminated from REP data).
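One possible way of carrying out the normalization and filtering steps on stored expression data is sketched below in Python. The sketch assumes that an expression pattern is held as a mapping from gene identifier to a non-negative signal, and normalizes to the total retained signal; both the data layout and the choice of normalization are assumptions made for illustration, and the function name is hypothetical.

```python
def normalize_pattern(raw_levels, exclude=()):
    """Drop excluded genes, then scale the remaining signals so they sum to 1."""
    excluded = set(exclude)
    kept = {gene: level for gene, level in raw_levels.items() if gene not in excluded}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no signal remains after filtering")
    return {gene: level / total for gene, level in kept.items()}
```

For instance, passing a housekeeping gene such as ACTB in `exclude` removes its typically large signal before scaling, so that it does not dominate the normalized pattern.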

TEPs can be generated in a manner similar to REPs, e.g., by hybridizing
a test sample to an array having a selected set of polynucleotides, particularly a selected
set of differentially expressed polynucleotides, acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the TEP with
a REP. The REP and TEP to be used in a comparison can be generated simultaneously,
or the TEP can be compared to previously generated and stored REPs.

In one embodiment of the invention, comparison of a TEP with a REP
involves hybridizing a test sample with a reference array, where the reference array has
one or more reference sequences for use in hybridization with a sample. The reference
sequences include all, at least one of, or any subset of the differentially expressed
polynucleotides described herein. Hybridization data for the test sample is acquired, the
data normalized, and the produced TEP compared with a REP generated using an array
having the same or similar selected set of differentially expressed polynucleotides.
Probes that correspond to sequences differentially expressed between the two samples
will show decreased or increased hybridization efficiency for one of the samples
relative to the other.
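The comparison of a normalized TEP with a normalized REP can be illustrated by the following Python sketch. It assumes that each pattern is a mapping from probe identifier to a non-negative normalized signal, and it treats a relative difference of more than 25% as a change; that threshold is an assumption drawn from the lower end of the 25% to 40% range this description gives for a substantial match, and the function names are hypothetical.

```python
def compare_patterns(tep, rep, max_relative_diff=0.25):
    """Label each shared probe as 'increased', 'decreased', or 'similar' in the test sample."""
    calls = {}
    for probe in sorted(set(tep) & set(rep)):
        reference = rep[probe]
        test = tep[probe]
        if reference == 0:
            # No reference signal: any test signal counts as an increase.
            calls[probe] = "similar" if test == 0 else "increased"
            continue
        relative_diff = (test - reference) / reference
        if relative_diff > max_relative_diff:
            calls[probe] = "increased"
        elif relative_diff < -max_relative_diff:
            calls[probe] = "decreased"
        else:
            calls[probe] = "similar"
    return calls


def is_match(tep, rep, max_relative_diff=0.25):
    """True when both patterns cover the same probes and every probe is 'similar'."""
    if set(tep) != set(rep):
        return False
    calls = compare_patterns(tep, rep, max_relative_diff)
    return all(call == "similar" for call in calls.values())
```

Requiring an identical probe set is stricter than "substantially the same set"; a less strict rule could tolerate a small number of missing probes, at the cost of a further parameter.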

Methods for collection of data from hybridization of samples with
reference arrays are well known in the art. For example, the polynucleotides of the
reference and test samples can be generated using a detectable fluorescent label, and
hybridization of the polynucleotides in the samples detected by scanning the
microarrays for the presence of the detectable label using, for example, a microscope
and light source for directing light at a substrate. A photon counter detects fluorescence
from the substrate, while an x-y translation stage varies the location of the substrate. A
confocal detection device that can be used in the subject methods is described in U.S.
Patent No. 5,631,734. A scanning laser microscope is described in Shalon et al.,
Genome Res. (1996) 6:639. A scan, using the appropriate excitation line, is performed
for each fluorophore used. The digital images generated from the scan are then
combined for subsequent analysis. For any particular array element, the
fluorescent signal from one sample (e.g., a test sample) is compared to the fluorescent
signal from another sample (e.g., a reference sample), and the relative signal intensity
determined.

Methods for analyzing the data collected from hybridization to arrays are
well known in the art. For example, where detection of hybridization involves a
fluorescent label, data analysis can include the steps of determining fluorescent intensity
as a function of substrate position from the data collected, removing outliers, i.e., data
deviating from a predetermined statistical distribution, and calculating the relative
binding affinity of the targets from the remaining data. The resulting data can be
displayed as an image with the intensity in each region varying according to the binding
affinity between targets and probes.
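The data collection and analysis steps can be illustrated with a short Python sketch covering one array element. The sketch assumes that the scanner output for an element is a list of pixel intensities per sample, and it substitutes a simple median-absolute-deviation rule for the "predetermined statistical distribution" mentioned above; that rule, the cut-off of three deviations, and the function names are all assumptions of this example.

```python
import math
import statistics


def feature_intensity(pixels, max_deviations=3.0):
    """Mean pixel intensity of one array element after discarding outlier pixels."""
    center = statistics.median(pixels)
    mad = statistics.median([abs(p - center) for p in pixels])
    if mad == 0:
        # More than half of the pixels equal the median; use it directly.
        return float(center)
    kept = [p for p in pixels if abs(p - center) <= max_deviations * mad]
    return statistics.fmean(kept)


def log2_signal_ratio(test_pixels, reference_pixels):
    """log2 of the test signal over the reference signal for one array element."""
    test = feature_intensity(test_pixels)
    reference = feature_intensity(reference_pixels)
    return math.log2(test / reference)
```

Expressing the relative signal on a log2 scale makes two-fold increases and decreases symmetric about zero; it presumes positive intensities, so background-subtracted values at or below zero would need separate handling.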



In general, the test sample is classified as having a gene expression
profile corresponding to that associated with a disease or non-disease state by
comparing the TEP generated from the test sample to one or more REPs generated from
reference samples (e.g., from samples associated with cancer or specific stages of
cancer, dysplasia, samples affected by a disease other than cancer, normal samples,
etc.). The criteria for a match or a substantial match between a TEP and a REP include
expression of the same or substantially the same set of reference genes, as well as
expression of these reference genes at substantially the same levels (e.g., no significant
difference between the samples for a signal associated with a selected reference
sequence after normalization of the samples, or at least no greater than about 25% to
about 40% difference in signal strength for a given reference sequence). In general, a
pattern match between a TEP and a REP includes a match in expression, preferably a
match in qualitative or quantitative expression level, of at least one of, all or any subset
of the differentially expressed genes of the invention.

Pattern matching can be performed manually, or can be performed using
a computer program. Methods for preparation of substrate matrices (e.g., arrays),
design of oligonucleotides for use with such matrices, labeling of probes, hybridization
conditions, scanning of hybridized matrices, and analysis of patterns generated,
including comparison analysis, are described in, for example, U.S. Patent No.
5,800,992.

Diagnosis, Prognosis and Management of Cancer

The polynucleotides of the invention and their gene products are of
particular interest as genetic or biochemical markers (e.g., in blood or tissues) that can
detect the earliest changes along the carcinogenesis pathway and/or monitor the
efficacy of various therapies and preventive interventions. For example, the level of
expression of certain polynucleotides can be indicative of a poorer prognosis, and
therefore warrant more aggressive chemo- or radio-therapy for a patient or vice versa.
The correlation of novel surrogate tumor specific features with response to treatment
and outcome in patients can define prognostic indicators that allow the design of
tailored therapy based on the molecular profile of the tumor. These therapies include
antibody targeting and gene therapy. Determining expression of certain polynucleotides
and comparison of a patient's profile with known expression in normal tissue and
variants of the disease allows a determination of the best possible treatment for a
patient, both in terms of specificity of treatment and in terms of comfort level of the
patient. Surrogate tumor markers, such as polynucleotide expression, can also be used
to better classify, and thus diagnose and treat, different forms and disease states of
cancer. Two classifications widely used in oncology that can benefit from identification
of the expression levels of the polynucleotides of the invention are staging of the
cancerous disorder, and grading the nature of the cancerous tissue.

The polynucleotides of the invention can be useful to monitor patients
having or susceptible to cancer to detect potentially malignant events at a molecular
level before they are detectable at a gross morphological level. Furthermore, a
polynucleotide of the invention identified as important for one type of cancer can also
have implications for development or risk of development of other types of cancer, e.g.,
where a polynucleotide is differentially expressed across various cancer types. Thus,
for example, expression of a polynucleotide that has clinical implications for metastatic
colon cancer can also have clinical implications for stomach cancer or endometrial
cancer.

Staging. Staging is a process used by physicians to describe how
advanced the cancerous state is in a patient. Generally, if a cancer is only detectable in
the area of the primary lesion without having spread to any lymph nodes it is called
Stage I. If it has spread only to the closest lymph nodes, it is called Stage II. In Stage
III, the cancer has generally spread to the lymph nodes in near proximity to the site of
the primary lesion. Cancers that have spread to a distant part of the body, such as the
liver, bone, brain or other site, are Stage IV, the most advanced stage.

The polynucleotides of the invention can facilitate fine-tuning of the
staging process by identifying markers for the aggressiveness of a cancer, e.g., the
metastatic potential, as well as the presence in different areas of the body. Thus, a
polynucleotide signifying a high metastatic potential can be used
to change a borderline Stage II tumor to a Stage III tumor, justifying more aggressive
therapy. Conversely, the presence of a polynucleotide signifying a lower metastatic
potential allows more conservative staging of a tumor.

Grading of cancers. Grade is a term used to describe how closely a
tumor resembles normal tissue of its same type. The microscopic appearance of a tumor
is used to identify tumor grade based on parameters such as cell morphology, cellular
organization, and other markers of differentiation. As a general rule, the grade of a
tumor corresponds to its rate of growth or aggressiveness, with undifferentiated or
high-grade tumors being more aggressive than well differentiated or low-grade tumors. The
following guidelines are generally used for grading tumors: 1) GX: grade cannot be
assessed; 2) G1: well differentiated; 3) G2: moderately well differentiated; 4) G3: poorly
differentiated; 5) G4: undifferentiated. The polynucleotides of the invention can be
especially valuable in determining the grade of the tumor, as they not only can aid in
determining the differentiation status of the cells of a tumor, they can also identify
factors other than differentiation that are valuable in determining the aggressiveness of a
tumor, such as metastatic potential.

Detection of lung cancer. The polynucleotides of the invention can be
used to detect lung cancer in a subject. Although there are more than a dozen different
kinds of lung cancer, the two main types of lung cancer are small cell and nonsmall cell,
which encompass about 90% of all lung cancer cases. Small cell carcinoma (also called
oat cell carcinoma) usually starts in one of the larger bronchial tubes, grows fairly
rapidly, and is likely to be large by the time of diagnosis. Nonsmall cell lung cancer
(NSCLC) is made up of three general subtypes of lung cancer. Epidermoid carcinoma
(also called squamous cell carcinoma) usually starts in one of the larger bronchial tubes
and grows relatively slowly. The size of these tumors can range from very small to
quite large. Adenocarcinoma starts growing near the outside surface of the lung and can
vary in both size and growth rate. Some slowly growing adenocarcinomas are described
as alveolar cell cancer. Large cell carcinoma starts near the surface of the lung, grows
rapidly, and the growth is usually fairly large when diagnosed. Other less common
forms of lung cancer are carcinoid, cylindroma, mucoepidermoid, and malignant
mesothelioma.

The polynucleotides of the invention, e.g., polynucleotides differentially
expressed in normal cells versus cancerous lung cells (e.g., tumor cells of high or low
metastatic potential) or between types of cancerous lung cells (e.g., high metastatic
versus low metastatic), can be used to distinguish types of lung cancer as well as to
identify traits specific to a certain patient's cancer and to select an appropriate
therapy. For example, if the patient's biopsy expresses a polynucleotide that is
associated with a low metastatic potential, it may justify leaving a larger portion of the
patient's lung in surgery to remove the lesion. Alternatively, a smaller lesion with
expression of a polynucleotide that is associated with high metastatic potential may
justify a more radical removal of lung tissue and/or the surrounding lymph nodes, even
if no metastasis can be identified through pathological examination.

Detection of breast cancer. The majority of breast cancers are
adenocarcinoma subtypes, which can be summarized as follows: 1) ductal carcinoma
in situ (DCIS), including comedocarcinoma; 2) infiltrating (or invasive) ductal
carcinoma (IDC); 3) lobular carcinoma in situ (LCIS); 4) infiltrating (or invasive)
lobular carcinoma (ILC); 5) inflammatory breast cancer; 6) medullary carcinoma;
7) mucinous carcinoma; 8) Paget's disease of the nipple; 9) Phyllodes tumor; and
10) tubular carcinoma.

The expression of polynucleotides of the invention can be used in the
diagnosis and management of breast cancer, as well as to distinguish between types of
breast cancer. Detection of breast cancer can be determined using expression levels of
any of the appropriate polynucleotides of the invention, either alone or in combination.
Determination of the aggressive nature and/or the metastatic potential of a breast cancer
can also be made by comparing levels of one or more polynucleotides of the
invention with levels of another sequence known to vary in cancerous tissue,
e.g., ER expression. In addition, development of breast cancer can be detected by
examining the ratio of expression of a differentially expressed polynucleotide to the
levels of steroid hormones (e.g., testosterone or estrogen) or to other hormones (e.g.,
growth hormone, insulin). Thus expression of specific marker polynucleotides can be
used to discriminate between normal and cancerous breast tissue, to discriminate
between breast cancers with different cells of origin, to discriminate between breast
cancers with different potential metastatic rates, etc.

Detection of colon cancer. The polynucleotides of the invention
exhibiting the appropriate expression pattern can be used to detect colon cancer in a
subject. Colorectal cancer is one of the most common neoplasms in humans and
perhaps the most frequent form of hereditary neoplasia. Prevention and early detection
are key factors in controlling and curing colorectal cancer. Colorectal cancer begins as
polyps, which are small, benign growths of cells that form on the inner lining of the
colon. Over a period of several years, some of these polyps accumulate additional
mutations and become cancerous. Multiple familial colorectal cancer disorders have
been identified, which are summarized as follows: 1) Familial adenomatous polyposis
(FAP); 2) Gardner's syndrome; 3) Hereditary nonpolyposis colon cancer (HNPCC); and
4) Familial colorectal cancer in Ashkenazi Jews. The expression of appropriate
polynucleotides of the invention can be used in the diagnosis, prognosis and
management of colorectal cancer. Detection of colon cancer can be determined using
expression levels of any of these sequences, alone or in combination.
Determination of the aggressive nature and/or the metastatic potential of a
colon cancer can be made by comparing levels of one or more polynucleotides of
the invention with total levels of another sequence known to vary in
cancerous tissue, e.g., expression of p53, DCC, ras, or FAP (see, e.g., Fearon ER, et al.,
Cell (1990) 61(5):759; Hamilton SR et al., Cancer (1993) 72:957; Bodmer W, et al.,
Nat Genet. (1994) 4(3):217; Fearon ER, Ann N Y Acad Sci. (1995) 768:101). For
example, development of colon cancer can be detected by examining the ratio of any of
the polynucleotides of the invention to the levels of oncogenes (e.g., ras) or tumor
suppressor genes (e.g., FAP or p53). Thus expression of specific marker
polynucleotides can be used to discriminate between normal and cancerous colon tissue,
to discriminate between colon cancers with different cells of origin, to discriminate
between colon cancers with different potential metastatic rates, etc.

Use of Polynucleotides to Screen for Peptide Analogs and Antagonists

Polypeptides encoded by the instant polynucleotides and corresponding
full length genes can be used to screen peptide libraries to identify binding partners,
such as receptors, from among the encoded polypeptides. Peptide libraries can be
synthesized according to methods known in the art (see, e.g., U.S. Patent No. 5,010,175,
and WO 91/17823). Agonists or antagonists of the polypeptides of the invention can be
screened using any available method known in the art, such as signal transduction,
antibody binding, receptor binding, mitogenic assays, chemotaxis assays, etc. The
assay conditions ideally should resemble the conditions under which the native activity
is exhibited in vivo, that is, under physiologic pH, temperature, and ionic strength.
Suitable agonists or antagonists will exhibit strong inhibition or enhancement of the
native activity at concentrations that do not cause toxic side effects in the subject.
Agonists or antagonists that compete for binding to the native polypeptide can require
concentrations equal to or greater than the native concentration, while inhibitors capable
of binding irreversibly to the polypeptide can be added in concentrations on the order of
the native concentration.

Such screening and experimentation can lead to identification of a novel
polypeptide binding partner, such as a receptor, encoded by a gene or a cDNA
corresponding to a polynucleotide of the invention, and at least one peptide agonist or
antagonist of the novel binding partner. Such agonists and antagonists can be used to
modulate, enhance, or inhibit receptor function in cells to which the receptor is native,
or in cells that possess the receptor as a result of genetic engineering. Further, if the
novel receptor shares biologically important characteristics with a known receptor,
information about agonist/antagonist binding can facilitate development of improved
agonists/antagonists of the known receptor.



Pharmaceutical Compositions and Therapeutic Uses

Pharmaceutical compositions of the invention can comprise
polypeptides, antibodies, or polynucleotides (including antisense nucleotides and
ribozymes) of the claimed invention in a therapeutically effective amount. The term
"therapeutically effective amount" as used herein refers to an amount of a therapeutic
agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a
detectable therapeutic or preventative effect. The effect can be detected by, for
example, chemical markers or antigen levels. Therapeutic effects also include reduction
in physical symptoms, such as decreased body temperature. The precise effective
amount for a subject will depend upon the subject's size and health, the nature and
extent of the condition, and the therapeutics or combination of therapeutics selected for
administration. Thus, it is not useful to specify an exact effective amount in advance.
However, the effective amount for a given situation is determined by routine
experimentation and is within the judgment of the clinician. For purposes of the present
invention, an effective dose will generally be from about 0.01 mg/kg to 50 mg/kg or
0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is
administered.
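
As a purely arithmetic illustration of the weight-based range stated above, the following minimal sketch converts a per-kilogram range into a total amount for a given body weight. The function name and default arguments are hypothetical, chosen only for this illustration; the defaults mirror the broader 0.01 mg/kg to 50 mg/kg range, and the sketch is not a dosing recommendation.

```python
def dose_range_mg(weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (low, high) total amount in mg for a subject of the given weight.

    The per-kg bounds default to the broad range quoted in the text; pass
    0.05 and 10.0 to use the narrower range instead.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound must not exceed high bound")
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)
```

For a 70 kg subject the broad range works out to roughly 0.7 mg to 3500 mg, which shows how wide the stated interval is and why the text leaves the precise amount to routine experimentation.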

A pharmaceutical composition can also contain a pharmaceutically
acceptable carrier. The term "pharmaceutically acceptable carrier" refers to a carrier for
administration of a therapeutic agent, such as antibodies or a polypeptide, genes, and
other therapeutic agents. The term refers to any pharmaceutical carrier that does not
itself induce the production of antibodies harmful to the individual receiving the
composition, and which can be administered without undue toxicity. Suitable carriers
can be large, slowly metabolized macromolecules such as proteins, polysaccharides,
polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and inactive virus particles. Such carriers are well known to those of ordinary skill in
the art. Pharmaceutically acceptable carriers in therapeutic compositions can include
liquids such as water, saline, glycerol and ethanol. Auxiliary substances, such as
wetting or emulsifying agents, pH buffering substances, and the like, can also be present
in such vehicles. Typically, the therapeutic compositions are prepared as injectables,
either as liquid solutions or suspensions; solid forms suitable for solution in, or
suspension in, liquid vehicles prior to injection can also be prepared. Liposomes are
included within the definition of a pharmaceutically acceptable carrier.
Pharmaceutically acceptable salts can also be present in the pharmaceutical
composition, e.g., mineral acid salts such as hydrochlorides, hydrobromides,
phosphates, sulfates, and the like; and the salts of organic acids such as acetates,
propionates, malonates, benzoates, and the like. A thorough discussion of
pharmaceutically acceptable excipients is available in Remington's Pharmaceutical
Sciences (Mack Pub. Co., New Jersey, 1991).

Delivery Methods. Once formulated, the compositions of the invention
can be (1) administered directly to the subject (e.g., as polynucleotide or polypeptides);
or (2) delivered ex vivo, to cells derived from the subject (e.g., as in ex vivo gene
therapy). Direct delivery of the compositions will generally be accomplished by
parenteral injection, e.g., subcutaneously, intraperitoneally, intravenously or
intramuscularly, intratumorally or to the interstitial space of a tissue. Other modes of
administration include oral and pulmonary administration, suppositories, and
transdermal applications, needles, and gene guns or hyposprays. Dosage treatment can
be a single dose schedule or a multiple dose schedule.



Methods for the ex vivo delivery and reimplantation of transformed cells
into a subject are known in the art and described in e.g., International Publication No.
WO 93/14778. Examples of cells useful in ex vivo applications include, for example,
stem cells, particularly hematopoietic, lymph cells, macrophages, dendritic cells, or
tumor cells. Generally, delivery of nucleic acids for both ex vivo and in vitro
applications can be accomplished by, for example, dextran-mediated transfection,
calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion,
electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct
microinjection of the DNA into nuclei, all well known in the art.

Once a gene corresponding to a polynucleotide of the invention has been
found to correlate with a proliferative disorder, such as neoplasia, dysplasia, and
hyperplasia, the disorder can be amenable to treatment by administration of a
therapeutic agent based on the provided polynucleotide, corresponding polypeptide or
other corresponding molecule (e.g., antisense, ribozyme, etc.).

The dose and the means of administration of the inventive
pharmaceutical compositions are determined based on the specific qualities of the
therapeutic composition, the condition, age, and weight of the patient, the progression
of the disease, and other relevant factors. For example, administration of
polynucleotide therapeutic compositions of the invention includes local or
systemic administration, including injection, oral administration, particle gun or
catheterized administration, and topical administration. Preferably, the therapeutic
polynucleotide composition contains an expression construct comprising a promoter
operably linked to a polynucleotide of at least 12, 22, 25, 30, or 35 contiguous nt of the
polynucleotide disclosed herein. Various methods can be used to administer the
therapeutic composition directly to a specific site in the body. For example, a small
metastatic lesion is located and the therapeutic composition injected several times in
several different locations within the body of the tumor. Alternatively, arteries which serve
a tumor are identified, and the therapeutic composition injected into such an artery, in
order to deliver the composition directly into the tumor. A tumor that has a necrotic
center is aspirated and the composition injected directly into the now empty center of
the tumor. The antisense composition is directly administered to the surface of the
tumor, for example, by topical application of the composition. X-ray imaging is used to
assist in certain of the above delivery methods.

Receptor-mediated targeted delivery of therapeutic compositions
containing an antisense polynucleotide, subgenomic polynucleotides, or antibodies to
specific tissues can also be used. Receptor-mediated DNA delivery techniques are
described in, for example, Findeis et al., Trends Biotechnol. (1993) 11:202; Chiou et al.,
Gene Therapeutics: Methods And Applications Of Direct Gene Transfer (J. A. Wolff,
ed.) (1994); Wu et al., J. Biol. Chem. (1988) 263:621; Wu et al., J. Biol. Chem. (1994)
269:542; Zenke et al., Proc. Natl. Acad. Sci. (USA) (1990) 87:3655; Wu et al., J. Biol.
Chem. (1991) 266:338. Therapeutic compositions containing a polynucleotide are
administered in a range of about 100 ng to about 200 mg of DNA for local
administration in a gene therapy protocol. Concentration ranges of about 500 ng to
about 50 mg, about 1 mg to about 2 mg, about 5 mg to about 500 mg, and about 20 mg
to about 100 mg of DNA can also be used during a gene therapy protocol. Factors such
as method of action (e.g., for enhancing or inhibiting levels of the encoded gene
product) and efficacy of transformation and expression are considerations which will
affect the dosage required for ultimate efficacy of the antisense subgenomic
polynucleotides. Where greater expression is desired over a larger area of tissue, larger
amounts of antisense subgenomic polynucleotides or the same amounts readministered
in a successive protocol of administrations, or several administrations to different
adjacent or close tissue portions of, for example, a tumor site, may be required to effect
a positive therapeutic outcome. In all cases, routine experimentation in clinical trials
will determine specific ranges for optimal therapeutic effect. For polynucleotide-related
genes encoding polypeptides or proteins with anti-inflammatory activity, suitable use,
doses, and administration are described in U.S. Patent No. 5,654,173.

The therapeutic polynucleotides and polypeptides of the present
invention can be delivered using gene delivery vehicles. The gene delivery vehicle can
be of viral or non-viral origin (see generally, Jolly, Cancer Gene Therapy (1994) 1:51;
Kimura, Human Gene Therapy (1994) 5:845; Connelly, Human Gene Therapy (1995)
1:185; and Kaplitt, Nature Genetics (1994) 6:148). Expression of such coding
sequences can be induced using endogenous mammalian or heterologous promoters.
Expression of the coding sequence can be either constitutive or regulated.

Viral-based vectors for delivery of a desired polynucleotide and
expression in a desired cell are well known in the art. Exemplary viral-based vehicles
include, but are not limited to, recombinant retroviruses (see, e.g., WO 90/07936; WO
94/03622; WO 93/25698; WO 93/25234; U.S. Patent No. 5,219,740; WO 93/11230;
WO 93/10218; U.S. Patent No. 4,777,127; GB Patent No. 2,200,651; EP 0 345 242; and
WO 91/02805), alphavirus-based vectors (e.g., Sindbis virus vectors, Semliki forest
virus (ATCC VR-67; ATCC VR-1247), Ross River virus (ATCC VR-373; ATCC VR-1246)
and Venezuelan equine encephalitis virus (ATCC VR-923; ATCC VR-1250;
ATCC VR-1249; ATCC VR-532)), and adeno-associated virus (AAV) vectors (see, e.g.,
WO 94/12649, WO 93/03769; WO 93/19191; WO 94/28938; WO 95/11984 and WO
95/00655). Administration of DNA linked to killed adenovirus as described in Curiel,
Hum. Gene Ther. (1992) 3:147 can also be employed.

Non-viral delivery vehicles and methods can also be employed,
including, but not limited to, polycationic condensed DNA linked or unlinked to killed
adenovirus alone (see, e.g., Curiel, Hum. Gene Ther. (1992) 3:147); ligand-linked
DNA (see, e.g., Wu, J. Biol. Chem. 264:16985 (1989)); eukaryotic cell delivery vehicle
cells (see, e.g., U.S. Patent No. 5,814,482; WO 95/07994; WO 96/17072;
WO 95/30763; and WO 97/42338) and nucleic charge neutralization or fusion with cell
membranes. Naked DNA can also be employed. Exemplary naked DNA introduction
methods are described in WO 90/11092 and U.S. Patent No. 5,580,859. Liposomes that
can act as gene delivery vehicles are described in U.S. Patent No. 5,422,120; WO
95/13796; WO 94/23697; WO 91/14445; and EP 0524968. Additional approaches are
described in Philip, Mol. Cell Biol. 14:2411 (1994), and in Woffendin, Proc. Natl.
Acad. Sci. (1994) 91:1581.

Further non-viral delivery suitable for use includes mechanical delivery
systems such as the approach described in Woffendin et al., Proc. Natl. Acad. Sci. USA
91(24):11581 (1994). Moreover, the coding sequence and the product of expression of
such can be delivered through deposition of photopolymerized hydrogel materials or
use of ionizing radiation (see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).
Other conventional methods for gene delivery that can be used for delivery of the
coding sequence include, for example, use of hand-held gene transfer particle gun (see,
e.g., U.S. Patent No. 5,149,655); use of ionizing radiation for activating transferred gene
(see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).

The present invention will now be illustrated by reference to the
following examples which set forth particularly advantageous embodiments. However,
it should be noted that these embodiments are illustrative and are not to be construed as
restricting the invention in any way.

EXAMPLES
EXAMPLE 1

Source of Biological Materials and Overview of Novel Polynucleotides
Expressed by the Biological Materials

Cell lines and human normal and tumor tissue were used to construct
cDNA libraries from mRNA isolated from the cells and tissues. Most sequences were
about 275-300 nucleotides in length. The cell lines include the KM12L4-A cell line, a
high metastatic colon cancer cell line (Morikawa, K. et al., Cancer Research (1988)
48:6863). The KM12L4-A cell line is derived from the KM12C cell line. The KM12C
cell line, which is poorly metastatic (low metastatic), was established in culture from a
Dukes' stage B2 surgical specimen (Morikawa et al., Cancer Res. (1988) 48:6863). The
KM12L4-A is a highly metastatic subline derived from KM12C (Yeatman et al., Nucl.
Acids Res. (1995) 23:4007; Bao-Ling et al., Proc. Annu. Meet. Am. Assoc. Cancer Res.
(1995) 27:3269). The KM12C and KM12C-derived cell lines (e.g., KM12L4,
KM12L4-A, etc.) are well-recognized in the art as model cell lines for the study of
colon cancer (see, e.g., Morikawa et al., supra; Radinsky et al., Clin. Cancer Res.
(1995) 1:19; Yeatman et al., (1995) supra; Yeatman et al., Clin. Exp. Metastasis (1996)
14:246). These and other cell lines and tissue are described in Table 6.

The sequences of the isolated polynucleotides were first masked to
eliminate low complexity sequences using the XBLAST masking program (Claverie
"Effective Large-Scale Sequence Similarity Searches," In: Computer Methods for
Macromolecular Sequence Analysis, Doolittle, ed., Meth. Enzymol. 266:212-227
Academic Press, NY, NY (1996); see particularly Claverie, in "Automated DNA
Sequencing and Analysis Techniques" Adams et al., eds., Chap. 36, p. 267 Academic
Press, San Diego, 1994 and Claverie et al., Comput. Chem. (1993) 17:191). Generally,
masking does not influence the final search results, except to eliminate sequences of
relatively little interest due to their low complexity, and to eliminate multiple "hits" based
on similarity to repetitive regions common to multiple sequences, e.g., Alu repeats. The
sequences remaining after masking were then used in a BLASTN vs. Genbank search;
sequences that exhibited greater than 70% overlap, 99% identity, and a p value of less
than 1 x 10^-40 were discarded. Sequences from this search also were discarded if the
inclusive parameters were met, but the sequence was ribosomal or vector-derived.
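
The discard rule for the BLASTN vs. Genbank step can be sketched as a small predicate. This is a minimal sketch under stated assumptions: the `Hit` record and its field names are hypothetical, "99% identity" is read as "greater than 99%" to match the later steps, and the ribosomal or vector clause is read as "discard regardless of the alignment statistics". None of this reproduces the software actually used.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Best database hit for one masked query sequence (hypothetical record)."""
    overlap_pct: float            # percent of the query covered by the alignment
    identity_pct: float           # percent identical positions in the alignment
    p_value: float                # BLAST p value of the hit
    ribosomal_or_vector: bool = False


def discard_after_genbank(hit):
    """Return True if the query should be dropped after the Genbank search."""
    known_sequence = (
        hit.overlap_pct > 70.0
        and hit.identity_pct > 99.0   # assumption: strict "greater than"
        and hit.p_value < 1e-40
    )
    return known_sequence or hit.ribosomal_or_vector
```

All three thresholds must hold together for a sequence to count as already known; a long but imperfect match, or a perfect but short one, is retained for the later searches.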

The resulting sequences from the previous search were classified into
three groups (1, 2 and 3 below) and searched in a BLASTX vs. NRP (non-redundant
proteins) database search: (1) unknown (no hits in the Genbank search), (2) weak
similarity (greater than 45% identity and p value of less than 1 x 10^-5), and (3) high
similarity (greater than 60% overlap, greater than 80% identity, and p value less than 1
x 10^-5). Sequences having greater than 70% overlap, greater than 99% identity, and p
value of less than 1 x 10^-40 were discarded.
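
The three-group classification above can be expressed as an ordered series of threshold checks. The sketch below is illustrative only: the dictionary keys are hypothetical, and the "unclassified" label for hits that fall below the weak-similarity thresholds is an assumption, since the text does not say how such hits were handled.

```python
def classify(best_hit):
    """Classify a query by its best hit.

    best_hit is None when the search returned no hit, otherwise a dict with
    'overlap' and 'identity' in percent and 'p' as the p value.
    """
    if best_hit is None:
        return "unknown"
    overlap = best_hit["overlap"]
    identity = best_hit["identity"]
    p = best_hit["p"]
    # Strictest rule first: near-identical matches are dropped.
    if overlap > 70 and identity > 99 and p < 1e-40:
        return "discard"
    if overlap > 60 and identity > 80 and p < 1e-5:
        return "high similarity"
    if identity > 45 and p < 1e-5:
        return "weak similarity"
    return "unclassified"  # assumption: not covered by the stated groups
```

Checking the rules from strictest to loosest matters, because every discarded hit also satisfies the high-similarity thresholds and every high-similarity hit also satisfies the weak-similarity thresholds.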

The remaining sequences were classified as unknown (no hits), weak
similarity, and high similarity (parameters as above). Two searches were performed on
these sequences. First, a BLAST vs. EST database search was performed and
sequences with greater than 99% overlap, greater than 99% similarity and a p value of
less than 1 x 10^-40 were discarded. Sequences with a p value of less than 1 x 10^-65 when
compared to a database sequence of human origin were also excluded. Second, a
BLASTN vs. Patent GeneSeq database was performed and sequences having greater
than 99% identity, p value less than 1 x 10^-40, and greater than 99% overlap were
discarded.

The remaining sequences were subjected to screening using other rules
and redundancies in the dataset. Sequences with a p value of less than 1 x 10^-111 in
relation to a database sequence of human origin were specifically excluded. The final
result provided the 3351 sequences listed in the accompanying Sequence Listing. Each
identified polynucleotide represents sequence from at least a partial mRNA transcript.
Polynucleotides that were determined to be novel were assigned a sequence
identification number.

The novel polynucleotides were assigned sequence identification numbers
SEQ ID NOs:1-3351. The first 1847 DNA sequences corresponding to the novel
polynucleotides are provided in the Sequence Listing in Table 1. DNA sequences
corresponding to the novel polynucleotides of SEQ ID NOs:1848-3351 are provided in the
Sequence Listing in Table 2. The DNA sequences of Table 2, while numbered SEQ ID
1-1504, correspond to SEQ ID NOs:1848-3351 in the Sequence Listing, e.g., Table 2 SEQ ID
1 is SEQ ID NO:1848, Table 2 SEQ ID 2 is SEQ ID NO:1849, etc. Each DNA sequence in
Table 4 is uniquely identified by a number that is 1847 less than its SEQ ID NO in the
Sequence Listing. Tables 1 and 2 provide: 1) the SEQ ID NO assigned to each sequence
for use in the present specification or a corresponding number; 2) the sequence name used
as an internal identifier of the sequence; 3) the name assigned to the clone from which the
sequence was isolated; and 4) the number of the cluster to which the sequence is assigned
(Cluster ID; where the cluster ID is 0, the sequence was not assigned to any cluster).
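
The numbering offset described above is a fixed shift of 1847. A minimal sketch of the conversion in both directions follows; the function and constant names are hypothetical and serve only to make the arithmetic explicit.

```python
OFFSET = 1847   # number of sequences numbered directly (SEQ ID NOs:1-1847)
TOTAL = 3351    # total number of sequences in the Sequence Listing


def table2_to_seq_id_no(table_number):
    """Convert a Table 2 number (1-1504) to its SEQ ID NO (1848-3351)."""
    if not 1 <= table_number <= TOTAL - OFFSET:
        raise ValueError("Table 2 numbers run from 1 to 1504")
    return table_number + OFFSET


def seq_id_no_to_table(seq_id_no):
    """Return (table, number within that table) for a SEQ ID NO."""
    if not 1 <= seq_id_no <= TOTAL:
        raise ValueError("SEQ ID NOs run from 1 to 3351")
    if seq_id_no <= OFFSET:
        return (1, seq_id_no)
    return (2, seq_id_no - OFFSET)
```

This reproduces the examples given in the text: Table 2 number 1 is SEQ ID NO:1848, number 2 is SEQ ID NO:1849, and the last entry, number 1504, is SEQ ID NO:3351.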



Because the provided polynucleotides represent partial mRNA
transcripts, two or more polynucleotides of the invention may represent different
regions of the same mRNA transcript and the same gene. Thus, if two or more SEQ ID
NOs: are identified as belonging to the same clone, then either sequence can be used to
obtain the full-length mRNA or gene.

EXAMPLE 2

Results of Public Database Search to Identify Function of Gene Products

SEQ ID NOs:1-3351 were translated in all three reading frames to
determine the best alignment with the individual sequences. These amino acid
sequences and nucleotide sequences are referred to, generally, as query sequences,
which are aligned with the individual sequences. Query and individual sequences were
aligned using the BLAST programs, available over the world wide web at
http://www.ncbi.nlm.nih.gov/BLAST/. Again the sequences were masked to various
extents to prevent searching of repetitive sequences or poly-A sequences, using the
XBLAST program for masking low complexity as described above in Example 1.

Tables 3 and 4 (inserted before the claims) show the results of the
alignments. Table 3 contains alignment information for SEQ ID NOs:1-1847 and Table 4
contains alignment information for SEQ ID NOs:1848-3351. The DNA sequences of Table
4, while numbered SEQ ID 1-1504, correspond to SEQ ID NOs:1848-3351. Each DNA
sequence in Table 4 is uniquely identified by a number that is 1847 less than its SEQ ID
NO. Tables 3 and 4 refer to each sequence by its SEQ ID NO or a corresponding number,
the accession numbers and descriptions of nearest neighbors from the Genbank and
Non-Redundant Protein searches, and the p values of the search results.

For each of SEQ ID NOs:1-1847, the best alignment to a protein or DNA
sequence is included in Table 3, and the best alignment for each of SEQ ID NOs:1848-3351
is included in Table 4. The activity of the polypeptide encoded by SEQ ID
NOs:1-3351 is the same or similar to the nearest neighbor reported in Table 3 or 4. The
accession number of the nearest neighbor is reported, providing a reference to the activities
exhibited by the nearest neighbor. The search program and database used for the alignment
also are indicated as well as a calculation of the p value.

Full length sequences or fragments of the polynucleotide sequences of
the nearest neighbors can be used as probes and primers to identify and isolate the full
length sequence of SEQ ID NOs:1-3351. The nearest neighbors can indicate a tissue or
cell type to be used to construct a library for the full-length sequences of SEQ ID
NOs:1-3351.

EXAMPLE 3 
Members of Protein Families 

The sequences (SEQ ID NOs:1-3351) were used to conduct a profile
search as described in the specification above. Several of the polynucleotides of the
invention were found to encode polypeptides having characteristics of a polypeptide
belonging to known protein families (and thus represent new members of these
protein families) and/or comprising a known functional domain (Table 5). "Start" and
"stop" in Table 3 indicate the position within the individual sequences that align with
the query sequence having the indicated SEQ ID NO. The direction indicates the
orientation of the query sequence with respect to the individual sequence, where
forward (for) indicates that the alignment is in the same direction (left to right) as the
sequence provided in the Sequence Listing and reverse (rev) indicates that the
alignment is with a sequence complementary to the sequence provided in the Sequence
Listing.
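
The forward and reverse orientations described above differ by a reverse complement. A minimal sketch, assuming unambiguous DNA letters and using the "for" and "rev" labels from the text (the function names are hypothetical):

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def aligned_strand(seq, direction):
    """Return the strand on which the alignment was reported."""
    if direction == "for":
        return seq
    if direction == "rev":
        return reverse_complement(seq)
    raise ValueError("direction must be 'for' or 'rev'")
```

For a hit reported as "rev", start and stop positions are most easily interpreted after converting the listed sequence with `aligned_strand(seq, "rev")`.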

Some polynucleotides exhibited multiple profile hits because, for 
example, the particular sequence contains overlapping profile regions, and/or the 
sequence contains two different functional domains. These profile hits are described in 
more detail below. 

Ank Repeats (ANK). SEQ ID NOs:187, 1268, 1804, 1819, 1830, 1839,
2652, 3015 and 3267 represent polynucleotides encoding an Ank repeat-containing
protein. The ankyrin motif is a 33 amino acid sequence named for the protein ankyrin
which has 24 tandem 33-amino-acid motifs. Ank repeats were originally identified in
the cell-cycle-control protein cdc10 (Breeden et al., Nature (1987) 329:651). Proteins
containing ankyrin repeats include ankyrin, myotropin, I-kappaB proteins, cell cycle
protein cdc10, the Notch receptor (Matsuno et al., Development (1997) 124(21):4265);
G9a (or BAT8) of the class III region of the major histocompatibility complex
(Biochem J. 290:811-818, 1993), FABP, GABP, 53BP2, Lin12, glp-1, SWI4, and
SWI6. The functions of the ankyrin repeats are compatible with a role in protein-protein
interactions (Bork, Proteins (1993) 17(4):363; Lambert and Bennet, Eur. J.
Biochem. (1993) 211:1; Kerr et al., Current Op. Cell Biol. (1992) 4:496; Bennet et al.,
J. Biol. Chem. (1980) 255:6424).

ATPases Associated with Various Cellular Activities (ATPases).
Sequences within SEQ ID NOs:431, 639, 2135, 2684, 2859, 3197 and 3266 correspond
to a sequence that encodes a novel member of the "ATPases Associated with diverse
cellular Activities" (AAA) protein family. The AAA protein family is composed of a
large number of ATPases that share a conserved region of about 220 amino acids that
contains an ATP-binding site (Froehlich et al., J. Cell Biol. (1991) 114:443; Erdmann et
al., Cell (1991) 64:499; Peters et al., EMBO J. (1990) 9:1757; Kunau et al., Biochimie
(1993) 75:209-224; Confalonieri et al., BioEssays (1995) 17:639;
http://yeamob.pci.chemie.uni-tuebingen.de/AAA/Description.html). The proteins that
belong to this family contain either one or two AAA domains. In general, the AAA
domains in these proteins act as ATP-dependent protein clamps (Confalonieri et al.
(1995) BioEssays 17:639). In addition to the ATP-binding 'A' and 'B' motifs, which are
located in the N-terminal half of this domain, there is a highly conserved region located
in the central part of the domain which was used in the development of the signature
pattern. The consensus pattern is: [LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-
[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R.
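
Consensus patterns of this kind use a compact notation: square brackets list the residues allowed at a position, curly braces list residues that are excluded, `x` is any residue, and a number in parentheses is a repeat count or range. The sketch below converts such a pattern into a Python regular expression. It is an illustration under stated assumptions: the function name is hypothetical, only the subset of the notation used in this section is handled, and the test peptide in the usage note is synthetic.

```python
import re

_ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})"   # any residue, literal, allowed set, excluded set
    r"(?:\((\d+)(?:,(\d+))?\))?"         # optional repeat: (n) or (n,m)
)


def prosite_to_regex(pattern):
    """Convert a pattern such as 'D-x-[DNS]-{ILV}-x(2,4)' to a regex string."""
    parts = []
    for element in pattern.replace(" ", "").rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


AAA_PATTERN = ("[LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-"
               "[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R")
```

With this helper, `re.search(prosite_to_regex(AAA_PATTERN), "LALLAGTNAAAALDAALAR")` finds a match, because the synthetic peptide was constructed to satisfy each position of the pattern in turn.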

Bromodomain (bromodomain). SEQ ID NO:1814 represents a
polynucleotide encoding a polypeptide having a bromodomain region (Haynes et al.,
1992, Nucleic Acids Res. 20:2693-2603, Tamkun et al., 1992, Cell 68:561-572, and
Tamkun, 1995, Curr. Opin. Genet. Dev. 5:473-477), which is a conserved region of
about 70 amino acids. The bromodomain is thought to be involved in protein-protein
interactions and may be important for the assembly or activity of multicomponent
complexes involved in transcriptional activation. The consensus pattern, which spans a
major part of the bromodomain, is: [STANVF]-x(2)-F-x(4)-[DNS]-x(5,7)-[DENQTF]-
Y-[HFY]-x(2)-[LIVMFY]-x(3)-[LIVM]-x(4)-[LIVM]-x(6,8)-Y-x(12,13)-[LIVM]-x(2)-
N-[SACF]-x(2)-[FY].

Basic Region Plus Leucine Zipper Transcription Factors (BZIP). SEQ
ID NOs:410, 552, 768, 822, 836, 1288, 1365, 1454, 1540, 1549, 1556, 1557, 1563,
1622, 1630, 1704, 1808, 2363, 2424, 3147, 3152, 3158 and 3208 represent
polynucleotides encoding a novel member of the family of basic region plus leucine
zipper transcription factors. The bZIP superfamily (Hurst, Protein Prof. (1995) 2:105;
and Ellenberger, Curr. Opin. Struct. Biol. (1994) 4:12) of eukaryotic DNA-binding
transcription factors encompasses proteins that contain a basic region mediating
sequence-specific DNA-binding followed by a leucine zipper required for dimerization.
The consensus pattern for this protein family is: [KR]-x(1,3)-[RKSAQ]-N-x(2)-
[SAQ](2)-x-[RKTAENQ]-x-R-x-[RK].
EF Hand (EFhand). SEQ ID NOs:820, 1755 and 3285 correspond to
polynucleotides encoding a novel protein in the family of EF-hand proteins. Many
calcium-binding proteins belong to the same evolutionary family and share a type of
calcium-binding domain known as the EF-hand (Kawasaki et al., Protein Prof. (1995)
2:305-490). This type of domain consists of a twelve residue loop flanked on both sides
by a twelve residue alpha-helical domain. In an EF-hand loop the calcium ion is
coordinated in a pentagonal bipyramidal configuration. The six residues involved in the
binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y,
-X and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding
Ca (bidentate ligand). The consensus pattern includes the complete EF-hand loop as
well as the first residue which follows the loop and which seems always to be
hydrophobic: D-x-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-
[DENQSTAGC]-x(2)-[DE]-[LIVMFYW].
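
The positional labels given above for the EF-hand loop can be made concrete with a short lookup. This is a minimal sketch: the function name is hypothetical, and the twelve-residue loop used in the usage note is an illustrative peptide chosen to satisfy the consensus pattern rather than a sequence taken from the Sequence Listing.

```python
# Loop positions (1-based) of the six calcium-coordinating residues.
LIGAND_POSITIONS = {"X": 1, "Y": 3, "Z": 5, "-Y": 7, "-X": 9, "-Z": 12}


def coordinating_residues(loop):
    """Map each ligand label to the residue found at its loop position."""
    if len(loop) != 12:
        raise ValueError("an EF-hand loop is twelve residues long")
    residues = {label: loop[pos - 1] for label, pos in LIGAND_POSITIONS.items()}
    if residues["-Z"] not in "DE":
        raise ValueError("position 12 is expected to be Glu or Asp")
    return residues
```

For the loop `"DKDGDGTITTKE"`, the function returns Asp at X, Y and Z, Thr at -Y and -X, and Glu at -Z, the bidentate position 12 described in the text.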

Ets Domain (Ets Nterm). SEQ ID NO:1811 represents a polynucleotide
encoding a polypeptide with N-terminal homology in ETS domain. Proteins of this
family contain a conserved domain, the "ETS-domain," that is involved in DNA
binding. The domain appears to recognize purine-rich sequences; it is about 85 to 90
amino acids in length, and is rich in aromatic and positively charged residues (Wasylyk,
et al., Eur. J. Biochem. (1993) 211:718). The ets gene family encodes a novel class of
DNA-binding proteins, each of which binds a specific DNA sequence and comprises an
ets domain that specifically interacts with sequences containing the common core
tri-nucleotide sequence GGA. In addition to an ets domain, native ets proteins comprise
other sequences which can modulate the biological specificity of the protein. Ets genes
and proteins are involved in a variety of essential biological processes including cell
growth, differentiation and development, and three members are implicated in
oncogenic process.

G-Protein Alpha Subunit (G-alpha). SEQ ID NO:1846 represents a
polynucleotide encoding a novel polypeptide of the G-protein alpha subunit family.
Guanine nucleotide binding proteins (G-proteins) are a family of membrane-associated
proteins that couple extracellularly-activated integral-membrane receptors to
intracellular effectors, such as ion channels and enzymes that vary the concentration of
second messenger molecules. G-proteins are composed of 3 subunits (alpha, beta and
gamma) which, in the resting state, associate as a trimer at the inner face of the plasma
membrane. The alpha subunit binds GTP and exhibits GTPase activity. G-protein alpha
subunits are 350-400 amino acids in length and have molecular weights in the range
40-45 kDa. Seventeen distinct types of alpha subunit have been identified in mammals,
and fall into 4 main groups on the basis of both sequence similarity and function:
alpha-s, alpha-q, alpha-i and alpha-12 (Simon et al., Science (1993) 252:802). They are often
N-terminally acylated, usually with myristate and/or palmitoylate, and these fatty acid
modifications can be important for membrane association and high-affinity interactions
with other proteins.

Helicases conserved C-terminal domain (helicase C). SEQ ID
NOs:1496, 2826 and 2871 represent polynucleotides encoding novel members of the
DEAD/H helicase family. A number of eukaryotic and prokaryotic proteins have been
characterized (Schmid S.R., et al., Mol. Microbiol. (1992) 6:283; Linder P., et al.,
Nature (1989) 337:121; Wassarman D.A., et al., Nature (1991) 349:463) on the basis of
their structural similarity. All are involved in ATP-dependent, nucleic-acid unwinding.
All DEAD box family members of the above proteins share a number of conserved
sequence motifs, some of which are specific to the DEAD family while others are
shared by other ATP-binding proteins or by proteins belonging to the helicases
'superfamily' (Hodgman T.C., Nature (1988) 333:22 and Nature (1988) 333:578
(Errata)). One of these motifs, called the "D-E-A-D-box", represents a special version of
the B motif of ATP-binding proteins. Some other proteins belong to a subfamily which
have His instead of the second Asp and are thus said to be "D-E-A-H-box" proteins
(Wassarman D.A., et al., Nature (1991) 349:463; Harosh I., et al., Nucleic Acids Res.
(1991) 19:6331; Koonin E.V. et al., J. Gen. Virol. (1992) 73:989). The following
signature patterns are used to identify members of both subfamilies: 1) [LIVMF](2)-D-
E-A-D-[RKEN]-x-[LIVMFYGSTN]; and 2) [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-
[NECR].
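
The two signature patterns above distinguish the D-E-A-D and D-E-A-H subfamilies, and they can be applied directly as regular expressions. The sketch below hand-translates the two patterns; the function name and the test peptides in the usage note are hypothetical, constructed only to satisfy one pattern or the other.

```python
import re

# 1) [LIVMF](2)-D-E-A-D-[RKEN]-x-[LIVMFYGSTN]
DEAD_BOX = re.compile(r"[LIVMF]{2}DEAD[RKEN].[LIVMFYGSTN]")
# 2) [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-[NECR]
DEAH_BOX = re.compile(r"[GSAH].[LIVMF]{3}DE[ALIV]H[NECR]")


def helicase_subfamily(protein):
    """Return 'DEAD', 'DEAH', or None if neither signature is present."""
    if DEAD_BOX.search(protein):
        return "DEAD"
    if DEAH_BOX.search(protein):
        return "DEAH"
    return None
```

For example, `helicase_subfamily("MKVLDEADRMLAA")` reports the D-E-A-D signature, while `helicase_subfamily("TTGAVIIDEAHERSS")` reports the D-E-A-H signature, since the fourth box residue is His rather than Asp.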

Homeobox domain (homeobox). SEQ ID NOs:1676, 1820 and 1821
represent polynucleotides encoding proteins having a homeobox domain. The
homeobox is a protein domain of 60 amino acids (Gehring In: Guidebook to the
Homeobox Genes, Duboule D., Ed., pp. 1-10, Oxford University Press, Oxford, (1994);
Buerglin In: Guidebook to the Homeobox Genes, pp. 25-72, Oxford University Press,
Oxford, (1994); Gehring, Trends Biochem. Sci. (1992) 17:277-280; Gehring et al.,
Annu. Rev. Genet. (1986) 20:147-173; Schofield, Trends Neurosci. (1987) 10:3-6) first
identified in a number of Drosophila homeotic and segmentation proteins. It is
extremely well conserved in many other animals, including vertebrates. This domain
binds DNA through a helix-turn-helix type of structure. Several proteins that contain a
homeobox domain play an important role in development. Most of these proteins are
sequence-specific DNA-binding transcription factors. The homeobox domain is also
very similar to a region of the yeast mating type proteins. These are sequence-specific
DNA-binding proteins that act as master switches in yeast differentiation by controlling
gene expression in a cell type-specific fashion.

A schematic representation of the homeobox domain is shown below.
The helix-turn-helix region is shown by the symbols 'H' (for helix), and 't' (for turn).

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxHHHHHHHHtttHHHHHHHHHxxxxxxxxxx 
1 60 

15 The pattern detects homeobox sequences 24 residues long and spans 

positions 34 to 57 of the homeobox domain. The consensus pattern is as follows: 
[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-[RKNQESTAIY]- 
[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]. 
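
The stated length of 24 residues, covering positions 34 to 57, can be checked by summing the repeat counts of the pattern elements. The Python sketch below does this; the function name `pattern_span` is an illustrative assumption and not part of the original disclosure.

```python
import re

HOMEOBOX = ("[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-"
            "[RKNQESTAIY]-[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]")

def pattern_span(pattern):
    """Return (minimum, maximum) number of residues a pattern can cover."""
    shortest = longest = 0
    for element in pattern.split("-"):
        # A trailing (n) or (n,m) gives the repeat count; otherwise it is 1.
        m = re.search(r"\((\d+)(?:,(\d+))?\)$", element)
        low = int(m.group(1)) if m else 1
        high = int(m.group(2)) if m and m.group(2) else low
        shortest += low
        longest += high
    return shortest, longest

print(pattern_span(HOMEOBOX))
```

A pattern with variable gaps, such as the C2H2 zinc finger pattern given later in this section, yields different minimum and maximum values.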

MAP kinase kinase (mkk). SEQ ID NOs:29, 31, 196, 3175, 3190 and 
3281 represent novel members of the MAP kinase kinase family. MAP kinases 
(MAPK) are involved in signal transduction, and are important in cell cycle and cell 
growth controls. The MAP kinase kinases (MAPKK) are dual-specificity protein 
kinases which phosphorylate and activate MAP kinases. MAPKK homologues have 
been found in yeast, invertebrates, amphibians, and mammals. Moreover, the 
MAPKK/MAPK phosphorylation switch constitutes a basic module activated in distinct 
pathways in yeast and in vertebrates. MAPKKs are essential transducers through which 
signals must pass before reaching the nucleus. For review, see, e.g., Biologique Biol 
Cell (1993) 79:193-207; Nishida et al., Trends Biochem Sci (1993) 18:128-31; 
Ruderman, Curr Opin Cell Biol (1993) 5:207-13; Dhanasekaran et al., Oncogene (1998) 
17:1447-55; Kiefer et al., Biochem Soc Trans (1997) 25:491-8; and Hill, Cell Signal 
(1996) 8:533-44. 

Protein Kinase (protkinase). SEQ ID NOs:1157, 1478, 1496, 2286, 2969 
and 3190 represent polynucleotides encoding protein kinases. Protein kinases catalyze 
phosphorylation of proteins in a variety of pathways, and are implicated in cancer. 
Eukaryotic protein kinases (Hanks S.K., et al., FASEB J. (1995) 9:576; Hunter T., Meth. 
Enzymol. (1991) 200:3; Hanks S.K., et al., Meth. Enzymol. (1991) 200:38; Hanks S.K., 
Curr. Opin. Struct. Biol. (1991) 1:369; Hanks S.K. et al., Science (1988) 241:42) are 
enzymes that belong to a very extensive family of proteins which share a conserved 
catalytic core common to both serine/threonine and tyrosine protein kinases. There are 
a number of conserved regions in the catalytic domain of protein kinases. The first 
region, which is located in the N-terminal extremity of the catalytic domain, is a 
glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown 
to be involved in ATP binding. The second region, which is located in the central part 
of the catalytic domain, contains a conserved aspartic acid residue which is important 
for the catalytic activity of the enzyme (Knighton D.R. et al., Science (1991) 253:407). 
The protein kinase profile includes two signature patterns for this second region: one 
specific for serine/threonine kinases and the other for tyrosine kinases. A third profile 
is based on the alignment in Hanks S.K. et al., FASEB J. (1995) 9:576, and covers the 
entire catalytic domain. 

The consensus patterns are as follows: 
1) [LIV]-G-{P}-G-{P}-[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x-[GSTACLIVMFY]-x(5,18)- 
[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K, where K binds ATP; 
2) [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3), where D is an active 
site residue; and 
3) [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-[RSTAC]-x(2)-N-[LIVMFYC], where D is an active 
site residue. 

If a protein analyzed includes two of the above protein kinase signatures, 
the probability of it being a protein kinase is close to 100%. 
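
The two-signature rule stated above can be sketched in Python as follows. The regular expressions are direct transcriptions of the three consensus patterns as they appear in this section; the function names and the example peptide are hypothetical and were constructed only to satisfy patterns 1 and 2.

```python
import re

# Regular-expression forms of the three consensus patterns listed above
# ({P} becomes [^P]; x(5,18) becomes .{5,18}).
KINASE_SIGNATURES = {
    "ATP-binding region": (
        r"[LIV]G[^P]G[^P][FYWMGSTNH][SGA][^PW][LIVCAT][^PD].[GSTACLIVMFY]"
        r".{5,18}[LIVMFYWCSTAR][AIVP][LIVMFAGCKR]K"),
    "Ser/Thr kinase active site":
        r"[LIVMFYC].[HY].D[LIVMFY]K.{2}N[LIVMFYCT]{3}",
    "Tyr kinase active site":
        r"[LIVMFYC].[HY].D[LIVMFY][RSTAC].{2}N[LIVMFYC]",
}

def kinase_signatures_found(protein):
    """Return the names of the signatures present in the sequence."""
    return [name for name, regex in KINASE_SIGNATURES.items()
            if re.search(regex, protein)]

def is_probable_kinase(protein):
    """Apply the rule from the text: two signatures indicate a kinase."""
    return len(kinase_signatures_found(protein)) >= 2

# Hypothetical peptide, not a disclosed sequence.
EXAMPLE = "MKTLGTGSFGRVMLVKHKETGNHYAMKILDKQLIYRDLKPENLLIDQ"
print(kinase_signatures_found(EXAMPLE), is_probable_kinase(EXAMPLE))
```

Reporting which signatures matched, rather than only a yes or no answer, also indicates whether the active site resembles the serine/threonine or the tyrosine kinase form.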

Ras family proteins (ras). SEQ ID NOs:1688 and 3258 represent 
polynucleotides encoding novel members of the ras family of small GTP/GDP-binding 
proteins (Valencia et al., 1991, Biochemistry 30:4637-4648). Ras family members 
generally require a specific guanine nucleotide exchange factor (GEF) and a specific 
GTPase activating protein (GAP) as stimulators of overall GTPase activity. Among 
ras-related proteins, the highest degree of sequence conservation is found in four 
regions that are directly involved in guanine nucleotide binding. The first two 
constitute most of the phosphate and Mg2+ binding site (PM site) and are located in the 
first half of the G-domain. The other two regions are involved in guanosine binding and 
are located in the C-terminal half of the molecule. Motifs and conserved structural 
features of the ras-related proteins are described in Valencia et al., 1991, Biochemistry 
30:4637-4648. A major consensus pattern of ras proteins is: 
D-T-A-G-Q-E-K-[LF]-G-G-L-R-[DE]-G-Y-Y. 

Thioredoxin family active site (Thioredox). SEQ ID NO:1677 represents 
a polynucleotide encoding a protein having a thioredoxin family active site. 
Thioredoxins (Holmgren A., Annu. Rev. Biochem. (1985) 54:237; Gleason F.K. et al., 
FEMS Microbiol. Rev. (1988) 54:271; Holmgren A., J. Biol. Chem. (1989) 264:13963; 
Eklund H. et al., Proteins (1991) 11:13) are small proteins of approximately one 
hundred amino-acid residues which participate in various redox reactions via the 
reversible oxidation of an active center disulfide bond. They exist in either a reduced 
form or an oxidized form where the two cysteine residues are linked in an 
intramolecular disulfide bond. Thioredoxin is present in prokaryotes and eukaryotes 
and the sequence around the redox-active disulfide bond is well conserved. All protein 
disulfide isomerases (PDI) contain two or three (ERp72) copies of the thioredoxin 
domain. The consensus pattern is: 
[LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-C- 
[GATPLVE]-[PHYWSTA]-C-x(6)-[LIVMFYWT] (where the two C's form the redox- 
active bond). 

Trypsin (trypsin). SEQ ID NO:1410 corresponds to a novel serine 
protease of the trypsin family. The catalytic activity of the serine proteases from the 
trypsin family is provided by a charge relay system involving an aspartic acid residue 
hydrogen-bonded to a histidine, which itself is hydrogen-bonded to a serine. The 
sequences in the vicinity of the active site serine and histidine residues are well 
conserved in this family of proteases (Brenner S., Nature (1988) 334:528). The 
consensus patterns for this trypsin protein family are: 1) [LIVM]-[ST]-A-[STAG]-H-C, 
where H is the active site residue; and 2) [DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]- 
S-G-[GS]-[SAPHV]-[LIVMFYWH]-[LIVMFYSTANQH], where S is the active site 
residue. All sequences known to belong to this family are detected by the above 
consensus sequences, except for 18 different proteases which have lost the first 
conserved glycine. If a protein includes both the serine and the histidine active site 
signatures, the probability of it being a trypsin family serine protease is 100%. 
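
As an illustration of how the two active site signatures might be used together, the Python sketch below reports the positions of the active site histidine and serine only when both patterns are present. The function name and the example peptide are hypothetical; the peptide was constructed solely to satisfy both patterns.

```python
import re

# Capturing groups mark the active-site residue of each consensus pattern.
HIS_SITE = re.compile(r"[LIVM][ST]A[STAG](H)C")
SER_SITE = re.compile(
    r"[DNSTAGC][GSTAPIMVQH].{2}G[DE](S)G[GS][SAPHV][LIVMFYWH][LIVMFYSTANQH]")

def trypsin_active_sites(protein):
    """Return 1-based positions of the active-site His and Ser, or None
    unless both signatures are found."""
    his = HIS_SITE.search(protein)
    ser = SER_SITE.search(protein)
    if his is None or ser is None:
        return None
    return {"His": his.start(1) + 1, "Ser": ser.start(1) + 1}

# Hypothetical peptide, not a disclosed sequence.
EXAMPLE = "MKVVSAAHCYKAAADSCQGDSGGPVVC"
print(trypsin_active_sites(EXAMPLE))
```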

WD Domain, G-Beta Repeats (WD domain). SEQ ID NOs:1336, 1380, 
1711, 1762, 1909, 2218, 3047, 3108 and 3292 represent novel members of the WD 
domain/G-beta repeat family. Beta-transducin (G-beta) is one of the three subunits 
(alpha, beta, and gamma) of the guanine nucleotide-binding proteins (G proteins) which 
act as intermediaries in the transduction of signals generated by transmembrane 
receptors (Gilman, Annu. Rev. Biochem. (1987) 56:615). The alpha subunit binds to 
and hydrolyzes GTP; the functions of the beta and gamma subunits are less clear but 
they seem to be required for the replacement of GDP by GTP as well as for membrane 
anchoring and receptor recognition. In higher eukaryotes, G-beta exists as a small 
multigene family of highly conserved proteins of about 340 amino acid residues. 
Structurally, G-beta consists of eight tandem repeats of about 40 residues, each 
containing a central Trp-Asp motif (this type of repeat is sometimes called a WD-40 
repeat). The consensus pattern for the WD domain/G-Beta repeat family is: 
[LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-x(2)- 
[LIVMWSTAC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]. 
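
Because the WD pattern is expected to occur several times in tandem, it can be useful to locate every occurrence rather than only the first. The Python sketch below lists the start positions of non-overlapping matches; the function name and the repeat unit are hypothetical and were made up to satisfy the consensus pattern.

```python
import re

# Regular-expression form of the WD domain/G-beta repeat consensus pattern.
WD_REPEAT = re.compile(
    r"[LIVMSTAC][LIVMFYWSTAGC][LIMSTAG][LIVMSTAGC].{2}[DN].{2}"
    r"[LIVMWSTAC].[LIVMFSTAG]W[DEN][LIVMFSTAGCN]")

def wd_repeat_starts(protein):
    """Return 1-based start positions of non-overlapping pattern matches."""
    return [m.start() + 1 for m in WD_REPEAT.finditer(protein)]

# Hypothetical sequence: two copies of a made-up repeat separated by a spacer.
UNIT = "LASGSDDKTIKIWDL"
EXAMPLE = "MQ" + UNIT + "PPPPPPPPPP" + UNIT + "KK"
print(wd_repeat_starts(EXAMPLE))
```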

wnt Family of Developmental Signaling Proteins (Wnt dev sign). SEQ 
ID NO:1538 corresponds to a novel member of the wnt family of developmental 
signaling proteins. Wnt-1 (previously known as int-1), the seminal member of this 
family (Nusse R., Trends Genet. (1988) 4:291), is thought to play a role in intercellular 
communication and seems to be a signalling molecule important in the development of 
the central nervous system (CNS). All wnt family proteins share the following features 
characteristic of secretory proteins: a signal peptide, several potential N-glycosylation 
sites and 22 conserved cysteines that are probably involved in disulfide bonds. The 
Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are 
therefore likely to signal over only a few cell diameters. The consensus pattern, which is 
based upon a highly conserved region including three cysteines, is as follows: 
C-K-C-H-G-[LIVMT]-S-G-x-C. 

Protein Tyrosine Phosphatase (Y phosphatase). SEQ ID NO:1417 
represents a polynucleotide encoding a protein tyrosine phosphatase. Tyrosine specific 
protein phosphatases (EC 3.1.3.48) (PTPase) (Fischer et al., Science (1991) 253:401; 
Charbonneau et al., Annu. Rev. Cell Biol. (1992) 8:463; Trowbridge, J. Biol. Chem. 
(1991) 266:23517; Tonks et al., Trends Biochem. Sci. (1989) 14:497; and Hunter, Cell 
(1989) 58:1013) catalyze the removal of a phosphate group attached to a tyrosine 
residue. These enzymes are very important in the control of cell growth, proliferation, 
differentiation and transformation. Multiple forms of PTPase have been characterized 
and can be classified into two categories: soluble PTPases and transmembrane receptor 
proteins that contain PTPase domain(s). Structurally, all known receptor PTPases are 
made up of a variable length extracellular domain, followed by a transmembrane region 
and a C-terminal catalytic cytoplasmic domain. PTPase domains consist of about 300 
amino acids. The second of two conserved cysteines has been shown to be absolutely 
required for activity. Furthermore, a number of conserved residues in its immediate 
vicinity have also been shown to be important. The consensus pattern for PTPases is: 
[LIVMF]-H-C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY]; C is the active site residue. 

Zinc Finger, C2H2 Type (Zincfing C2H2). SEQ ID NOs:308, 807, 
1324, 1503, 1527, 3081, 3193 and 3306 correspond to polynucleotides encoding novel 
members of the C2H2 type zinc finger protein family. Zinc finger domains (Klug 
et al., Trends Biochem. Sci. (1987) 12:464; Evans et al., Cell (1988) 52:1; Payre et al., 
FEBS Lett. (1988) 234:245; Miller et al., EMBO J. (1985) 4:1609; and Berg, Proc. Natl. 
Acad. Sci. USA (1988) 85:99) are nucleic acid-binding protein structures. In addition to 
the conserved zinc ligand residues, it has been shown that a number of other positions 
are also important for the structural integrity of the C2H2 zinc fingers (Rosenfeld et al., 
J. Biomol. Struct. Dyn. (1993) 11:557). The best conserved position is found four 
residues after the second cysteine; it is generally an aromatic or aliphatic residue. The 
consensus pattern for C2H2 zinc fingers is: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H- 
x(3,5)-H. The two C's and two H's are zinc ligands. 
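
The relationship between the zinc ligands and the best conserved position can be made concrete with a short Python sketch. It locates each match of the C2H2 pattern and reports the four ligand positions together with the residue found four positions after the second cysteine. The function name and the example peptide are hypothetical and not part of the original disclosure.

```python
import re

# C2H2 consensus with capturing groups on the four zinc ligands (groups 1, 2,
# 4, 5) and on the conserved position after the second cysteine (group 3).
C2H2 = re.compile(r"(C).{2,4}(C).{3}([LIVMFYWC]).{8}(H).{3,5}(H)")

def c2h2_fingers(protein):
    """Return, for each match, the 1-based ligand positions and the
    (position, residue) of the conserved hydrophobic site."""
    fingers = []
    for m in C2H2.finditer(protein):
        fingers.append({
            "ligands": [m.start(g) + 1 for g in (1, 2, 4, 5)],
            "conserved": (m.start(3) + 1, m.group(3)),
        })
    return fingers

# Hypothetical single-finger peptide, not a disclosed sequence.
EXAMPLE = "YKCPECGKSFSQKSNLQKHQRTH"
print(c2h2_fingers(EXAMPLE))
```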

Src homology 2. SEQ ID NOs:186, 2591, 3307 and 3339 represent 
polynucleotides encoding novel members of the family of Src homology 2 (SH2) 
proteins. The Src homology 2 (SH2) domain is a protein domain of about 100 amino 
acid residues first identified as a conserved sequence region between the oncoproteins 
Src and Fps (Sadowski I. et al., Mol. Cell. Biol. 6:4396-4408 (1986)). Similar sequences 
are found in many other intracellular signal-transducing proteins (Russel R.B. et al., 
FEBS Lett. 304:15-20 (1992)). SH2 domains function as regulatory modules of 
intracellular signalling cascades by interacting with high affinity with phosphotyrosine- 
containing target peptides in a sequence-specific and phosphorylation-dependent 
manner (Marangere L.E.M., Pawson T., J. Cell Sci. Suppl. 18:97-104 (1994); Pawson 
T., Schlessinger J., Curr. Biol. 3:434-442 (1993); Mayer B.J., Baltimore D., Trends 
Cell Biol. 3:8-13 (1993); Pawson T., Nature 373:573-580 (1995)). 

The SH2 domain has a conserved 3D structure consisting of two alpha 
helices and six to seven beta-strands. The core of the domain is formed by a continuous 
beta-meander composed of two connected beta-sheets (Kuriyan J., Cowburn D., Curr. 
Opin. Struct. Biol. 3:828-837 (1993)). The profile to detect SH2 domains is based on a 
structural alignment consisting of 8 gap-free blocks and 7 linker regions totaling 92 
match positions. 

Src homology 3. SEQ ID NOs:234, 1832 and 1835 represent 
polynucleotides encoding novel members of the family of Src homology 3 (SH3) 
proteins. The Src homology 3 (SH3) domain is a small protein domain of about 60 
amino acid residues first identified as a conserved sequence in the non-catalytic part of 
several cytoplasmic protein tyrosine kinases (e.g., Src, Abl, Lck) (Mayer B.J. et al., 
Nature 332:272-275 (1988)). Since then, it has been found in a great variety of other 
intracellular or membrane-associated proteins (Musacchio A. et al., FEBS Lett. 307:55- 
61 (1992); Pawson T., Schlessinger J., Curr. Biol. 3:434-442 (1993); Mayer B.J., 
Baltimore D., Trends Cell Biol. 3:8-13 (1993); Pawson T., Nature 373:573-580 (1995)). 

The SH3 domain has a characteristic fold which consists of five or six 
beta strands arranged as two tightly packed anti-parallel beta sheets. The linker regions 
may contain short helices (Kuriyan J., Cowburn D., Curr. Opin. Struct. Biol. 3:828-837 
(1993)). 

The function of the SH3 domain may be to mediate assembly of specific 
protein complexes via binding to proline-rich peptides (Morton C.J., Campbell I.D., 
Curr. Biol. 4:615-617 (1994)). 

In general SH3 domains are found as single copies in a given protein, but 
there are a significant number of proteins with two SH3 domains and a few with 3 or 4 
copies. 

Fibronectin type III. SEQ ID NOs:746 and 1192 represent 
polynucleotides encoding novel members of the family of fibronectin type III proteins. 
A number of receptors for lymphokines, hematopoietic growth factors and growth 
hormone-related molecules have been found to share a common binding domain 
(Bazan J.F., Biochem. Biophys. Res. Commun. 164:788-795 (1989); Bazan J.F., Proc. 
Natl. Acad. Sci. U.S.A. 87:6934-6938 (1990); Cosman D. et al., Trends Biochem. Sci. 
15:265-270 (1990); d'Andrea A.D., Fasman G.D., Lodish H.F., Cell 58:1023-1024 
(1989); d'Andrea A.D., Fasman G.D., Lodish H.F., Curr. Opin. Cell Biol. 2:648-651 
(1990)). 

The conserved region constitutes all or part of the extracellular ligand- 
binding region and is about 200 amino acid residues long. In the N-terminal of this 
domain there are two pairs of cysteines known, in the growth hormone receptor, to be 
involved in disulfide bonds. 

+----------------------------------+xxxxxxx+-----------------+ 
| C C    C  C      Extracellular   |XXXXXXX|   Cytoplasmic   | 
+-|-|----|--|----------------------+xxxxxxx+-----------------+ 
  | |    |  |                    Transmembrane 
  +-+    +--+ 

Two patterns detect this family of receptors. The first one is derived 
from the first N-terminal disulfide loop, the second is a tryptophan-rich pattern located 
at the C-terminal extremity of the extracellular region. 

A consensus for this protein family is: C-[LVFYR]-x(7,8)-[STIVDN]-C- 
x-W [The two Cs are linked by a disulfide bond]. A second consensus for this protein 
family is: [STGL]-x-W-[SG]-x-W-S. 

LIM domain containing proteins. SEQ ID NOs:1269, 1309, 1360, and 
1386 represent polynucleotides encoding novel members of the family of LIM domain 
containing proteins. A number of proteins contain a conserved cysteine-rich domain of 
about 60 amino-acid residues (Freyd G. et al., Nature 344:876-879 (1990); Baltz R. et 
al., Plant Cell 4:1465-1466 (1992); Sanchez-Garcia I., Rabbitts T.H., Trends Genet. 
10:315-320 (1994)). 

In the LIM domain, there are seven conserved cysteine residues and a 
histidine. The arrangement followed by these conserved residues is C-x(2)-C-x(16,23)- 
H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]. The LIM domain binds two zinc 
ions (Michelsen J.W. et al., Proc. Natl. Acad. Sci. U.S.A. 90:4404-4408 (1993)). LIM 
does not bind DNA; rather, it seems to act as an interface for protein-protein interaction. 
The consensus for this protein family is: C-x(2)-C-x(15,21)-[FYWH]-H-x(2)-[CH]- 
x(2)-C-x(2)-C-x(3)-[LIVMF]. The 5 Cs and the H bind zinc. 

C2 domain (protein kinase C like). SEQ ID NOs:1325 and 2282 
represent polynucleotides encoding novel members of the family of C2 domain 
containing proteins. Some isozymes of protein kinase C (PKC) contain a domain, 
known as C2, of about 116 amino-acid residues, which is located between the two 
copies of the C1 domain (that bind phorbol esters and diacylglycerol) and the protein 
kinase catalytic domain (Azzi A. et al., Eur. J. Biochem. 208:547-557 (1992); Stabel S., 
Semin. Cancer Biol. 5:277-284 (1994)). 

The C2 domain is involved in calcium-dependent phospholipid binding 
(Davletov B.A., Suedhof T.C., J. Biol. Chem. 268:26386-26390 (1993)). Since 
domains related to the C2 domain are also found in proteins that do not bind calcium, 
other putative functions for the C2 domain include binding to inositol-1,3,4,5- 
tetraphosphate (Fukuda M., et al., J. Biol. Chem. 269:29206-29211 (1994)). 

The consensus pattern for the C2 domain is located in a conserved part 
of that domain, the connecting loop between beta strands 2 and 3. The profile for the C2 
domain covers the total domain. The consensus for this protein family is: [ACG]-x(2)- 
L-x(2,3)-D-x(1,2)-[NGSTLIF]-[GTMR]-x-[STAP]-D-[PA]-[FY]. 

Serine proteases, trypsin family, active sites. SEQ ID NO:1410 
represents a polynucleotide encoding a novel member of the family of serine protease, 
trypsin proteins. The catalytic activity of the serine proteases from the trypsin family is 
provided by a charge relay system involving an aspartic acid residue hydrogen-bonded 
to a histidine, which itself is hydrogen-bonded to a serine. The sequences in the vicinity 
of the active site serine and histidine residues are well conserved in this family of 
proteases (Brenner S., Nature 334:528-530 (1988)). 

A consensus for this protein family is: [LIVM]-[ST]-A-[STAG]-H-C [H 
is the active site residue]. A second consensus for this protein family is: [DNSTAGC]- 
[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]- 
[LIVMFYSTANQH] [S is the active site residue]. 

RNA Recognition Motif Domain (RRM, RBD, or RNP). SEQ ID NOs: 
1464 and 1514 represent polynucleotides encoding novel members of the family of 
RNA recognition motif domain proteins (Bandziulis R.J. et al., Genes Dev. 3:431-437 
(1989); Dreyfuss G. et al., Trends Biochem. Sci. 13:86-91 (1988)). 

Inside the putative RNA-binding domain there are two regions which are 
highly conserved. The first one is a hydrophobic segment of six residues (which is 
called the RNP-2 motif); the second one is an octapeptide motif (which is called RNP-1 
or RNP-CS). The position of both motifs in the domain is shown in the following 
schematic representation: 

xxxxxxx######xxxxxxxxxxxxxxxxxxxxxxxxxxxxx######xxxxxxxxxxxxxxxxxxxxxx 
       RNP-2                              RNP-1 

As a consensus pattern for this type of domain the RNP-1 motif was 
used. The consensus for this protein family is: [RK]-G-{EDRKHPCG}-[AGSCI]- 
[FY]-[LIVA]-x-[FYLM]. 

Phosphatidylinositol-specific phospholipase C, Y Domain. SEQ ID NO: 
1707 represents a polynucleotide encoding a novel member of the phosphatidylinositol- 
specific phospholipase C, Y domain family of proteins. Phosphatidylinositol-specific 
phospholipase C (EC 3.1.4.11), a eukaryotic intracellular enzyme, plays an important 
role in signal transduction processes (Meldrum E. et al., Biochim. Biophys. Acta 
1092:49-71 (1991)). It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol- 
4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol- 
1,4,5-triphosphate. This catalytic process is tightly regulated by reversible 
phosphorylation and binding of regulatory proteins (Rhee S.G., Choi K.D., Adv. Second 
Messenger Phosphoprotein Res. 25:35-61 (1992); Rhee S.G., Choi K.D., J. Biol. Chem. 
267:12393-12396 (1992); Sternweis P.C., Smrcka A.V., Trends Biochem. Sci. 17:502- 
506 (1992)). 

All eukaryotic PI-PLCs contain two regions of homology, referred to as 
"X-box" and "Y-box". The order of these two regions is the same (NH2-X-Y-COOH), 
but the spacing is variable. In most isoforms, the distance between these two regions is 
only 50-100 residues but in the gamma isoforms one PH domain, two SH2 domains, 
and one SH3 domain are inserted between the two PLC-specific domains. The two 
conserved regions have been shown to be important for the catalytic activity. At the C- 
terminal of the Y-box, there is a C2 domain possibly involved in Ca-dependent 
membrane attachment. 

Serine Carboxypeptidases. SEQ ID NO:1744 represents a 
polynucleotide encoding a novel member of the serine carboxypeptidases family of 
proteins. Carboxypeptidases may be either metallo carboxypeptidases or serine 
carboxypeptidases (EC 3.4.16.5 and EC 3.4.16.6). The catalytic activity of the serine 
carboxypeptidases, like that of the trypsin family serine proteases, is provided by a 
charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, 
which is itself hydrogen-bonded to a serine (Liao D.I., Remington S.J., J. Biol. Chem. 
265:6528-6531 (1990)). 

The sequences surrounding the active site serine and histidine residues 
are highly conserved in all these serine carboxypeptidases. A consensus for this protein 
family is: [LIVM]-x-[GTA]-E-S-Y-[AG]-[GS] [S is the active site residue]. A second 
consensus for this protein family is: [LIVF]-x(2)-[LIVSTA]-x-[IVPST]-x-[GSDNQL]- 
[SAGV]-[SG]-H-x-[IVAQ]-P-x(3)-[PSA] [H is the active site residue]. 

dsrm Double-Stranded RNA Binding Motif. SEQ ID NO:1818 
represents a polynucleotide encoding a novel member of the dsrm double-stranded 
RNA binding motif proteins. In eukaryotic cells, a multitude of RNA-binding proteins 
play key roles in the posttranscriptional regulation of gene expression. Characterization 
of these proteins has led to the identification of several RNA-binding motifs. Several 
human and other vertebrate genetic disorders are caused by aberrant expression of 
RNA-binding proteins (C. G. Burd & G. Dreyfuss, Science 265:615-621 (1994)). 

Proteins containing double stranded RNA binding motifs bind to specific 
RNA targets. Double stranded RNA binding motifs are exemplified by interferon- 
induced protein kinase in humans, which is part of the cellular response to dsRNA. 

SEQ ID NOs:2577, 3183 and 3195 encode members of the 4 transmembrane 
integral membrane protein family. This family consists of type III proteins, 
which are integral membrane proteins that contain an N-terminal membrane-anchoring 
domain that is not cleaved during biosynthesis, and which functions as a translocation 
signal and a membrane anchor. The proteins also have three additional transmembrane 
regions. The consensus pattern is: G-x(3)-[LIVMF]-x(2)-[GSA]-[LIVMF](2)-G-C-x- 
[GA]-[STA]-x(2)-[EG]-x(2)-[CWN]-[LIVM](2). 

SEQ ID NO:2944 encodes a polypeptide having a calpain large subunit, 
domain III. Calpains are a family of intracellular proteases that play a variety of 
biological roles. Calpain 3, also known as p94, is predominantly expressed in skeletal 
muscle and plays a role in limb-girdle muscular dystrophy type 2A (Sorimachi, H. et 
al., Biochem. J. 328:721-732, 1997). 

SEQ ID NOs:1911 and 1980 encode polypeptides having a C3HC4 type 
zinc finger domain (RING finger), which is a cysteine-rich domain of 40 to 60 residues 
that binds two atoms of zinc, and is believed to be involved in mediating protein-protein 
interactions. Mammalian proteins of this family include V(D)J recombination 
activating protein, which activates the rearrangement of immunoglobulin and T-cell 
receptor genes; breast cancer type 1 susceptibility protein (BRCA1); bmi-1 proto- 
oncogene; cbl proto-oncogene; and mel-18 protein, which is expressed in a variety of 
tumor cells and is a transcriptional repressor that recognizes and binds a specific DNA 
sequence. The consensus pattern is: C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA]. 

SEQ ID NO:3274 encodes a eukaryotic transcription factor with a fork 
head domain of about 100 amino acid residues. Proteins of this group are transcription 
factors, including mammalian transcription factors HNF-3-alpha, -beta, and -gamma; 
interleukin-enhancer binding factor; and HTLF, which binds to a region of human T- 
cell leukemia virus long terminal repeat. The consensus pattern is: [KR]-P-[PTQ]- 
[FYLVQH]-S-[FY]-x(2)-[LIVM]-x(3,4)-[AC]-[LIM]. 

SEQ ID NO:3345 encodes a polypeptide having a PDZ domain. Several 
dozen signaling proteins belong to this group of proteins that have 80-100 residue 
repeats known as PDZ domains. Several of the proteins interact with the C-terminal 
tetrapeptide motifs X-Ser/Thr-X-Val-COO- of ion channels and/or receptors (Ponting, 
C. P., Protein Sci. 6:464-468, 1997). 

SEQ ID NO:3351 encodes a polypeptide in the family of phorbol 
esters/diacylglycerol binding proteins. Phorbol esters (PE) are analogues of diacylglycerol 
(DAG) and potent tumor promoters. DAG activates a family of serine-threonine protein 
kinases, known as protein kinase C. The N-terminal region of protein kinase C binds 
PE and DAG, and contains one or two copies of a cysteine-rich domain of about 50 
amino acid residues. Other proteins having this domain include diacylglycerol kinase; 
the vav oncogene; and N-chimaerin, a brain-specific protein. The DAG/PE binding 
domain binds two zinc ions through the six cysteines and two histidines that are 
conserved in the domain. The consensus pattern is: H-x-[LIVMFYW]-x(8,11)-C-x(2)- 
C-x(3)-[LIVMFC]-x(5,10)-C-x(2)-C-x(4)-[HD]-x(2)-C-x(5,9)-C. 

SEQ ID NO:2216 encodes a polypeptide having a WW/rsp5/WWP 
domain. The domain is named for the presence of conserved aromatic positions, 
generally tryptophan, as well as a conserved proline. Proteins having the domain 
include dystrophin, vertebrate YAP protein, and IQGAP, a human GTPase activating 
protein which acts on ras. The consensus pattern is: W-x(9,11)-[VFY]-[FYW]-x(6,7)- 
[GSTNE]-[GSTQCR]-[FYW]-x(2)-P. 

SEQ ID NO:2428 encodes a member of the dual specificity phosphatase 
family, having a catalytic domain, and SEQ ID NOs:2281 and 2310 encode members 
of the protein tyrosine phosphatase family. These families are related and classified as 
tyrosine specific protein phosphatases. The enzymes catalyze the removal of a 
phosphate group from a tyrosine residue, and are important in the control of cell growth, 
proliferation, differentiation, and transformation. The consensus pattern is: [LIVMF]-H- 
C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY]. 

SEQ ID  CLUSTER  SEQ NAME                     ORIENTATION  CLONE ID         LIBRARY
1       377044   RTA00002676F.p.11.2.P.Seq    F            M00039329A:C01   CH09LNL
2       377708   RTA00002633F.m.01.2.P.Seq    F            M00040039A:G08   CH09LNL
3       427782   RTA00002666F.l.06.1.P.Seq    F            M00032633D:A06   CH08LNH
4       29372    RTA00002712F.a.06.1.P.Seq    F            M00023282A:C02   CH04MAL
5       455003   RTA00002694F.b.02.1.P.Seq    F            M00043419D:A10   CH20COHLV
6       380625   RTA00002634F.d.03.2.P.Seq    F            M00040113D:G10   CH09LNL
7       450959   RTA00002691F.b.05.3.P.Seq    F            M00043306D:B07   CH17COHLV
8       397351   RTA00002680F.b.04.1.P.Seq    F            M00039775A:A09   CH09LNL
9       20652    RTA00002710F.k.01.1.P.Seq    F            M00022440B:E01   CH03MAH
10      97330    RTA00002663F.k.13.1.P.Seq    F            M00022767B:G11   CH03MAH
11      373071   RTA00002670F.j.23.1.P.Seq    F            M00033442A:D06   CH09LNL
12      162369   RTA00002713F.e.01.1.P.Seq    F            M00027292D:F10   CH04MAL
13      401247   RTA00002685F.f.15.2.P.Seq    F            M00039508A:C12   CH12EDT
14      430738   RTA00002669F.i.15.3.P.Seq    F            M00033251D:B09   CH08LNH
15      46779    RTA0000271(F.c.14.1.P.Seq    F            M00022860C:G04   CH03MAH
16      375772   RTA00002681F.p.01.2.P.Seq    F            M00039909C:G05   CH09LNL
17      430639   RTA00002669F.j.01.3.P.Seq    F            M00033243B:A05   CH08LNH
18      376546   RTA00002677F.d.07.2.P.Seq    F            M00039545C:C12   CH09LNL
19      430041   RTA00002667F.f.17.1.P.Seq    F            M00032790B:A07   CH08LNH
20      431643   RTA00002669F.l.16.1.P.Seq    F            M00053276D:H09   CH08LNH
21      19422    RTA00002709F.c.02.1.P.Seq    F            M00005449B:B10   CH02COH
22      376802   RTA00002677F.c.18.2.P.Seq    F            M00039344B:G07   CH09LNL
23      376814   RTA00002674F.h.02.1.P.Seq    F            M00039139C:G12   CH09LNL
24      375492   RTA0000::6"7F.m.19.2.P.Seq   F            M000394133:D08   CH09LNL
25      379114   RTA00002631F.n.24.2.P.Seq    F            M00039903C:F03   CH09LNL
26      380668   RTA00002670F.p.11.1.P.Seq    F            M00033531C:H10   CH09LNL
27      215817   RTA00002664F.!.19.2.P.Seq    F            M00027634A:D11   CH04MAL
28      37574Q   RTA00002680F.f.23.1.P.Seq    F            M00039795D:G06   CH09LNL
29      430396   RTA00002669F.b.20.4.P.Seq    F            M000331SiC:D01   CH08LNH
30      380462   RTA00002670F.o.01.1.P.Seq    F            M000335~0B:E06   CH09LNL
31      430396   RTA00002669F.b.20.3.P.Seq    F            M000331S5C:D01   CH08LNH
32      376996   RTA00002676F.p.13.2.P.Seq    F            M00039329C:B10   CH09LNL
33      374346   RTA00002677F.k.19.2.P.Seq    F            M00039412D:G06   CH09LNL
34      379075   RTA00002672F.n.13.2.P.Seq    F            M0003905C>B:E03  CH09LNL
35      374172   RTA00002673F.k.16.2.P.Seq    F            M00039097D:D06   CH09LNL
36      373104   RTA00002633F.o.15.2.P.Seq    F            M000400^SD:G12   CH09LNL
37      186302   RTA0000"*7[3F.m."11.1.P.Seq  F            M00027?o|B:C04   CH04MAL
38      427947   RTA00002665F.o.01.1.P.Seq    F            M000324OfB:D02   CH08LNH
39      375130   RTA00002673F.d.17.1.P.Seq    F            M0003l)0c4D:H09  CH09LNL
40      377584   RTA00002633F.l.2^.\P.Seq     F            M000400SSC:E10   CH09LNL
41      J//J04   RTA00002orSF.a.15.2.P.Seq    F            M00039452C:A01   CH09LNL
42      37634-   RTA00002675F.l.08.1.P.Seq    F            M000?O24OC:G11   CH09LNL
43      446747   RTA000026SQF.j.16.2.P.Seq    F            M00042740A:E09   CH15CON
44      28092    RTA00002"!IF.a.12.1.P.Seq    F            M00023052A:B05   CH05MAH
45      378206   RTA00002671F.j.20.3.P.Seq    F            M000335SSC:G04   CH09LNL
46      378206   RTA00002cTIF.a.20.2.P.Seq    F            M000355SSC:G04   CH09LNL
4T* 


14940 


RTAOC002~09F.J.I i.l P. Sea 


F 


M00005623A:G02 | CH02COK 
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SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATE 


I CLONE ID 


LIBRARY 


48 


3784 1 1 


RTA00002672F.g. i 3.2. P.Seq 


F 


M00039004B:A06 


CH09LNL 


| 49 


38120 


RTA00002712F.L 14. LP.Seq 


F 


M00026927D:F02 


CH04MAL 


50 


375730 


RTA00002678F.1. 1 3.2.P.Seq 


F 


M00039612B:G05 


CH09LNL 


51 


428959 


RTA00002667F.h. I 5. 1 .P.Seq 


F 


M0003281 1B:D02 


CH08LNH 


52 


376851 


RTA00002677F.C.03.2. P.Seq 


F 


M00039341C:H1 1 


CH09LNL 


53 


373808 


RTA0000267 1 F.d. 1 4.2. P.Seq 


F 


M00038272A:G01


CH09LNL 


54 


376168 


RTA00002675F.n. 1 7. 1 .P.Seq 


F 


M00039258B:E06


CH09LNL 


55 


18653 


RTA00002712F.O.08. 1 .P.Seq 


F 


M00027135A:B11


CH04MAL 


56 


187632 


RTA00002664F.l.15.1.P.Seq


F 


M00027617B:C12


CH04MAL 


57 


374122 


RTA00002673F.l.22.1.P.Seq


F 


M00039104D:C09 


CH09LNL 


58 


374946 


RTA00002673F.j.24.1.P.Seq


F 


M00039096A:E07


CH09LNL 


59 


375666 


RTA00002677F.n. 1 6.2. P.Seq 


F 


M00039422D:F04 


CH09LNL 


60 


162369 


RTA00002713F.d.24.1.P.Seq


F 


M00027292D:F10 


CH04MAL 


61 


i 21480 


RTA00002709F.C. 1 8.2. P.Seq 


F 


M00005531D:F06


CH02COH 


62 


18560 


RTA00002711F.e.20.1.P.Seq


F 


M00022938B:F07


CH03MAH 


63 


96575 


RTA00002663F.J.08. 1 .P.Seq 


F 


M00022641C:H05


CH03MAH 


64 


377576 


RTA00002682F.f.18.1.P.Seq


F 


M00039975C:C11


CH09LNL 


65 


446747 


RTA00002689F.d. 16.3. P.Seq 


F 


M00042740A:E09 


CH15CON 


66 


379311


RTA00002682F.g.01.1.P.Seq


F 


M00039976D:A12 


CH09LNL 


67 


379311


RTA00002682F.f.24. 1 .P.Seq 


F 


M00039976D:A12


CH09LNL 


68 


124549 


RTA00002713F.C.07.1. P.Seq 


F 


M00027237C:B08


CH04MAL 


69 


449785 


RTA00002691F.c.07.3.P.Seq 


F 


M00043345C:A06 


CH17COHLV 


70 


375134 


RTA00002673F.k.22.1.P.Seq


F 


M00039099A:H08


CH09LNL 


71 


186593 


RTA000027 1 3F.n. 15. 1 .P.Seq 


F 


M00027620D:F11


CH04MAL 


72 


449831


RTA00002691F.a.17.3.P.Seq


F 


M00042518D:A06


CH17COHLV


73 


379678 


RTA00002676F.b.06. 1 .P.Seq 


F 


M00039274B:G07 


CH09LNL 


74 


20599 


RTA00002708F.h.06.1.P.Seq


F 


M00004264B:A05 


CH01COH


75 


41115 


RTA00002713F.o.11.1.P.Seq


F 


M00027652B:F11


CH04MAL 


76 


21109


RTA00002708F.h.12.1.P.Seq


F 


M00004278A:F09 


CH01COH


77 


455702 


RTA00002694F.b. 1 1 . 1 .P.Seq 


F 


M00043433C:G07 


CH20COHLV


78 


380643 


RTA00002683F.p.09.2. P.Seq 


F 


M00040103B:H10


CH09LNL 


79 


374413 


RTA00002672F.i.15.2.P.Seq


F 


M00039015B:G10 


CH09LNL


80 


378891 


RTA00002672F.i. 18.2. P.Seq 


F 


M00039016A:A02 


CH09LNL 


81 


379374 


RTA00002672F.k.11.2.P.Seq


F 


M00039028C:B11


CH09LNL


82 


17253 


RTA00002709F.h.23. 1 .P.Seq 


F 


M00006866A:D07 


CH02COH 


83 


21565 


RTA00002709F.e. 11.1. P.Seq 


F 


M00005778B:F09


CH02COH 


84 


373996 


RTA00002673F.n.11.1.P.Seq


F 


M00039108D:B06 


CH09LNL 


85 


380437 


RTA00002683F.c.09.1.P.Seq


F 


M00040039D:D06 


CH09LNL 


86 


430729 


RTA00002669F.h. 1 8.2. P.Seq 


F 


M00033226A:A11


CH08LNH


87 


376791 


RTA00002674F.l.17.1.P.Seq


F 


M00039166B:G06 


CH09LNL 


88 


373760 


RTA00002672F.p.20.1.P.Seq


F 


M00039049D:G07 


CH09LNL 


89 


373837 


RTA00002672F.p.22.1.P.Seq


F 


M00039050A:H10 


CH09LNL 


90 


376435 


RTA00002678F.h. 1 7.2. P.Seq 


F 


M00039476B:A02 


CH09LNL 


91 


373881 


RTA00002672F.b.20.1.P.Seq


F 


M0003863SD:H03 


CH09LNL 


92 


377086 


RTA00002676F.p.07. 1 .P.Seq 


F 


M00039328D:D07 


CH09LNL


93 


377889 


RTA00002672F.c.08.1.P.Seq


F 


M00038661A:A07 


CH09LNL 


94 


380442 


RTA00002684F.b.05.2.P.Seq 


F 


M00040111C:D05


CH09LNL
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K I A000026 / oF.m. 1 2. P. Sec 


F 
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CH09LNL 


96


J / JJjV 
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E7 

F 


M00039616A:B10


CH09LNL 


97


14 IV / 


K I A0U002 / ! Ur.t. 1 3. 1 .P.Seq 


F 


M00022084D:B01


CH03MAH 


98


joUooo 


K 1 A00002oo4r c.U4.2.P.Seq 


F 


M00040115B:H12


CH09LNL


99


J / / JJZ 


K 1 AUUUU20 / /r .1. 1 _>.2.P.Seq 


F 


M00039404B:A05 


CH09LNL 


100


J /V I oo 


Ri A00002oo2r .a.Oj. 1 .P.Seq 


F 


M00039914D:G12 


CH09LNL 


101


42o2ov 


D T A A AAA7 £L£L£ C ^.17 1 FJ C .. ^. 

K 1 AUUUU2ooor c. 1 1 .P.Seq 


F


M00032539B:C11


CH08LNH 


102


*io~* i £ t 


K I AUUUU26 / 1 r .1. 1 j.j. P.Seq 


F 


M00038327A:C11


CH09LNL 


103


1 J j27 


RTA00002710F.p.07.1.P.Seq


F 


M00022747D:E03 


CH03MAH 


104


J / / 3U4 


D T a a aaao £7t r ; t t "» r» r .„ 

K 1 AUUUU2o / 1 r.i. I /.j.P.beq 


F 


M00038303C:D02


CH09LNL 


105




RTA00002710F.g.17.1.P.Seq


F 


M00022183B:C02 


CH03MAH 


106


1 OQ 1 OO 

I2y 1 /V 


K 1 AU<JUU2oo2r .u. 1 V.2.P.beq 


F 


M00007157C:F11


CH02COH 


107


377086


RTA00002676F.p.07.2.P.Seq


F 


M00039328D:D07 


CH09LNL 


108


J /jo /2 


DTA A AAATiC "7 C C U ! C 1 AC . „ 

K 1 AUUUU20 /Dr .n. 1 0. 1 .P.beq 


F 


M00039233A:A03


CH09LNL 


109




D T A A A AAT£ Id IT ; A*7 ** r> C ^. 

K I AUUUU2o76r .i.U7._).P.5eq 


F 


M00039303C:F11


CH09LNL 




374266


RTA00002674F.l.08.2.P.Seq


F 


M00039144C:E06


CH09LNL 


111


"5 "TOO O 

37898.? 


RTA00002682F.a.07.1.P.Seq


F 


M00039915D:C11


CH09LNL 


112


377343 


RTA00002684F.g.04.1.P.Seq


F 


M00040302C:A04


CH09LNL 


113


378679 


RTA0000268 1 F.t. 16. 2. P.Seq 


F 


M00039869B:F06 


CH09LNL 


114


374095 


RTA00002671F.p.08.2.P.Seq


F 


M00038618C:C08 


CH09LNL 




375843 


RTA00002671F.o.06.2.P.Seq


F 


M00038614C:H11


CH09LNL


116


377788 


RTA00002684F.h.01.2.P.Seq


F 


M00040305C:H06 


CH09LNL 


117


2 1 40 j 


I> T A AAAAT7AAC '. AC i n r 

R I A00002709F.j.O^. 1 .P .^eq 


F 


M0000692SD:D07 


CH02COH


118


2 j 1 84 


DT \ A A AA1 "7 A A C U AC "1 A C 

K I A0UUU2 /09r. o.O j. 2. P.Seq 


F 


M00005358B:B06 


CH02COH 


119


I jo7 I 


DT A AAAAT7 I AC 1, 1 1 n C 

R 1 A000U27 lOr.K. lo. I .P.Seq 


F 


M00022495D:H08 


CH03MAH 


120


1 / / JO / 


DT \ AAAAO^/i 7 C OO t D c 

K 1 AUUUU2oo^ r .m.22. 1 .P.Seq 


F 


M00022986D:H09 


CH03MAH 


121


377788


RTA00002684F.g.24.1.P.Seq


F 


M00040305C:H06


CH09LNL 


122


J / JUJ5 


OTA AAAAt^^CT U AO 1 r> c . ^ 

R I A0U0026 / Dr .fl.02. 1 .P.Seq 


F 


M00039250D:G12


CH09LNL 


123


"2 q a ,i n 
_><5U4 I 2 


DTA AAA AO £OACI» 1 C 1 n C 

R I AUUUU26o0r .k. 13. 2. P. Sea 


F 


M00039816B:D04 


CH09LNL 


124


178447


D T A A A Am £L £. "* t? — C\C 1 n C 

R rA000U2ooj r .n.Oo. I .P.Seq 


F 


M00023007A:H04 


CH03MAH 


125


j /oo47 


D T A A A AA1 £L "7 ,1 C U A *? t P> C 

R 1 A0UUU2o /4r n.U /. I .P.Seq 


F 


M00039140D:D09 


CH09LNL 


126


/I f "7 A 

44o /V 


dtaaaaao^^ic^. in i r> . „ 

R I AUUUU2oo 1 r .e. IV. 1 .P.Seq 


F 


M00003S00A:F09 


CH01COH


127


j / /oj9 


DTA AAAAO /;70C n. A/1 ~> AC 

R 1 AUUUU20 / or .a.U4.2. P.Seq 


F 


M00039430B:F12
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RTA0000268 1 F.L07.2.P.Seq 


F 


M00039879D:B11


CH09LNL 


243 


373862 


RTA00002671F.g.01.2.P.Seq


F 


M00038284B:H04 


CH09LNL


244 


373252 


RTA00002670F.k. 1 6. 1 .P.Seq 


F 


M00033451A:H01


CH09LNL 


245 


378475 


RTA00002672F.g.24. 1 .P.Seq 


F 


M00039006D:B01


CH09LNL 


246 


379941 


RTA00002682F.j.15.1.P.Seq


F 


M00039990C:D10


CH09LNL 


247 


427703 


RTA00002665F.e.11.1.P.Seq


F 


M00028357A:G10


CH08LNH 


248 


373976 


RTA0000267 1 F.p. 1 5. 2. P.Seq 


F 


M00038619B:A03 


CH09LNL 


249 


431643


RTA00002669F.l.16.3.P.Seq


F 


M00033276D:H09 


CH08LNH 


250 


383502 


RTA00002670F.k.07.1.P.Seq


F 


M00033446D:B02 


CH09LNL 


251 


378764 


RTA00002681F.j.04.1.P.Seq


F 


M00039884A:H11


CH09LNL 


252 


431629


RTA00002669F.l.14.3.P.Seq


F 


M00033276B:G08 


CH08LNH 


253 


372992 


RTA0000267 1 F.b. 1 6.2. P.Seq 


F 


M00033594C:B03


CH09LNL 


254 


431601 


RTA00002669F.k.08.3.P.Seq 


F 


M00033263B:G04 


CH08LNH 


255 


21059 


RTA00002710F.c.05.1.P.Seq


F 


M00008053A:F10


CH03MAH 


256 


430689 


RTA00002669F.i.24.3.P.Seq 


F 


M00033243B:A05


CH08LNH 


257 


131764


RTA00002662F.c.14.1.P.Seq


F 


M00006893C:E07 


CH02COH 


258 


373300 


RTA00002674F.c.21.2.P.Seq 


F 


M00039126D:A08 


CH09LNL 


259 


384601 


RTA00002670F.k.06. 1 .P.Seq 


F 


M00033446C:G08 


CH09LNL


260 


375389 


RTA00002674F.a. 1 3. 2. P.Seq 


F 


M00039120C:C09 


CH09LNL


261 


15248 


RTA00002710F.f.23.1.P.Seq


F 


M00022127C:H03 


CH03MAH 


262 


428134


RTA00002666F.c.15.1.P.Seq


F 


M00032540A:A09 


CH08LNH 


263 


37418-i 


RTA00002672F.a. 1 9. 1 .P.Seq 


F 


M00038633A:D07 


CH09LNL 


264 


136225 


RTA00002676F.n.02.2.P.Seq 


F 


M00039319C:A04 


CH09LNL


265 


401713 


RTA00002685F.p. 10. 1. P.Seq 


F 


M00039647A:H1 1 


CH12EDT 


266 


27104 


RTA00002661F.a.09.1.P.Seq


F 


M00001363D:D09


CH01COH 


267 


207466 


RTA00002664F.j.08.2.P.Seq


F 


M00027733A:A02 


CH04MAL 


268 


143045 


RTA00002663F.a.02.1.P.Seq


F 


M00007941D:C09 


CH03MAH 


269 


378830 


RTA00002675F.e.07. 1 .P.Seq 


F 


M00039221A:H03


CH09LNL


270 


21731


RTA00002709F.k.07. 1 .P.Seq 


F 


M00007013A:D09


CH02COH 


271 


428552 


RTA00002666F.c.16.1.P.Seq


F 


M00032541D:H08 


CH08LNH


272 


187632 


RTA00002664F.l.15.2.P.Seq


F 


M00027617B:C12


CH04MAL 


273 


431053


RTA00002668F.o.05.2.P.Seq 


F 


M00033130B:F06


CH08LNH


274 


188972 


RTA00002664F.d.20. 1 .P.Seq 


F 


M00027030C:H06 


CH04MAL 


275 


430678 


RTA00002668F.h.12.1.P.Seq


F 


M00032994A:A08 


CH08LNH


276


374042


RTA00002672F.a.08.1.P.Seq


F


M00038631C:B10


CH09LNL


277 


24332 


RTA00002709F.j.07.1.P.Seq


F 


M00006935C:F06 


CH02COH 


278 


376764 


RTA00002674F.f.20.1. P.Seq 


F 


M00039135D:F05


CH09LNL


279 


13433S 


RTA00002662F.c.15.2.P.Seq


F 


M00006897A:H02


CH02COH 


280 


375541


RTA00002680F.d.21.2.P.Seq


F 


M000397S3A.E03 


CH09LNL


281 


228909


RTA00002664F.e.08.2.P.Seq 


F 


M00027085C:E11


CH04MAL 


282 


58063 


RTA00002661F.h.18.1.P.Seq


F 


M00004234A:E07 


CH01COH






WO 01/02568 



PCT/US00/18374 



283 


380500 


RTA00002670F.p. 1 9.2.P.Seq 


F 


M00033583B:E06 


CH09LNL 


284 


34928 


RTA00002710F.p.21.1.P.Seq


F 


M00022795B:G06 


CH03MAH 


285 


374028 


RTA00002674F.k.03.2.P.Seq 


F 


M00039156A:B11


CH09LNL 


286 


374121 


RTA00002672F.h.22.2.P.Seq 


F 


M00039013A:C09


CH09LNL


287 


429547 


RTA00002668F.c.07.1.P.Seq


F 


M000329j7D:G09 


CH08LNH 


288 


380668 


RTA00002670F.p.11.2.P.Seq


F 


M00033581C:H10


CH09LNL 


289 


258704 


RTA00002665F.m.06.1. P.Seq 


F 


M00032480B:E10 


CH08LNH 


290 


380325 


RTA00002670F.p^r2. P.Seq 


F 


M00033583D:B05 


CH09LNL 


291 


378326 


RTA00002681F.m.11.2.P.Seq


F 


M00039396C:H01 


CH09LNL 


292 


375618 


RTA00002675F.d. 13. 1 .P.Seq 


F 


M00039218A:F03


CH09LNL 


293 


20999 


RTA00002709F.j.16.1.P.Seq


F 


M00006977C:G04 


CH02COH 


294 


29102 


RTA00002710F.p.18.1.P.Seq


F 


M00022793D:B01


CH03MAH 


295 


379334 


RTA00002680F.b.22. 1 .P.Seq 


F 


M00039778C:A04 


CH09LNL 


296 


23943 


RTA00002709F.i.12.1.P.Seq


F 


M00006886D:H02


CH02COH 


297 


373998 


RTA00002672F.a. 10.2. P.Seq 


F 


M00038631D:B02


CH09LNL 


298 


373325 


RTA00002672F.c.l4.2.P.Seq 


F 


M00038662B:A12 


CH09LNL


299 


373818 


RTA00002672F.e. 1 5.2.P.Seq 


F 


M00038995C:G08 


CH09LNL 


300 


429843 


RTA00002668F.c.10.1.P.Seq


F 


M00032918B:E06


CH08LNH 


301 


427755 


RTA00002665F.d. 1 9.3. P.Seq 


F 


M00028316B:H12


CH08LNH 


302 


189177 


RTA00002664F.c.23.2.P.Seq 


F 


M00026922C:G03 


CH04MAL 


303 


13294


RTA00002709F.j.15.1.P.Seq


F 


M00006968A:G08


CH02COH 


304 


178801 


RTA00002663F.n.01.1.P.Seq


F 


M00022997A:F06 


CH03MAH 


305 


230865 


RTA00002664F.d.03.2.P.Seq 


F 


M0002692SD:A03 


CH04MAL 


306 


178801 


RTA00002663F.m.24.1.P.Seq


F 


M00022997A:F06


CH03MAH 


307 


378809 


RTA00002672F.g.21.2.P.Seq


F 


M00039005C:H01 


CH09LNL 


308 


378957 


RTA00002670F.d. 1 7.2. P.Seq 


F 


M00033362C:C05 


CH09LNL 


309 


373523 


RTA00002674F.n.2 1 . 1 .P.Seq 


F


M00039177B:D03 


CH09LNL 


310 


375458 


RTA00002678F.l.06.2.P.Seq


F 


M00039611D:D11


CH09LNL


311 


429794 


RTA00002668F.c.09.1.P.Seq


F 


M00032918B:D08


CH08LNH 


312 


72797 


RTA00002661F.e.07.1.P.Seq


F 


M00003761C:F02


CH01COH


313 


429992 


RTA00002668F.c.21.1.P.Seq


F 


M00032921B:H08


CH08LNH 


314 


374410 


RTA00002674F.k.11.2.P.Seq


F 


M00039158B:G12


CH09LNL 


315 


376553 


RTA00002674F.g. 1 9. 1 .P.Seq 


F 


M00039139A:C09 


CH09LNL 


316 


429096 


RTA00002666F.f. 1 6. 1 .P.Seq 


F 


M00032578A:G06 


CH08LNH 


317 


181948 


RTA00002663 F.n.05. 1 .P.Seq 


F 


M00023003C:D07 


CH03MAH 


318 


378475 


RTA00002672F.h.0 1 .2. P.Seq 


F 


M00039006D:B01


CH09LNL 


319 


427336 


RTA00002665F.c.23.1.P.Seq


F 


M00028210B:D02


CH08LNH


320 


374042 


RTA00002672F.a.08.2.P.Seq 


F 


M00038631C:B10


CH09LNL 


321 


386543 


RTA00002672F.f. 13.2. P.Seq 


F 


M00038^99B:G11 


CH09LNL 


322 


376659 


RTA00002678F.h. 1 1 .2. P.Seq 


F 


M00039475C:E10


CH09LNL 


323 


29135 


RTA00002663F.C.09.1. P.Seq 


F 


M00021*23C:DI1 


CH03MAH 


324


377967 


RTA00002681F.m.17.2.P.Seq


F 


M00039S«7D:C10 


CH09LNL 


325 


431330 


RTA00002668F.m. 1 6. 2. P.Seq 


F 


M000330"4A:C08 


CH08LNH


326 


373824 


RTA00002680F.i.19.2.P.Seq


F 


M00039S0SD:H02 


CH09LNL 


327 


50094 


RTA00002661F.j.02.2.P.Seq


F 


M000043'8A:B10 


CH01COH


328 


214272 


RTA00002664F.h.03.2.P.Seq


F 


M00027366A:F11


CH04MAL 


329 


377293 


RTA00002680F.b.17.2.P.Seq


F 


M0003 C >"7C:E05 


CH09LNL 



330 


195053 


RTA00002663F.n.16.1.P.Seq


F 


M00023044B:D02 


CH03MAH


331 


21274 


RTA00002709F.m.09.1.P.Seq


F 


M00007194A:B09


CH02COH 


332 


376580 


RTA00002675F.b.20.1.P.Seq


F 


M00039212C:C12 


CH09LNL 


333 


374725 


RTA00002673F.f.02.2. P.Seq 


F 


M00039070D:C02 


CH09LNL 


334 


25238 


RTA00002710F.n.08.1.P.Seq


F 


M00022634D:C08 


CH03MAH 


335 


377337 


RTA00002683F.l.07.2.P.Seq


F 


M00040085D:A10 


CH09LNL 


336 


450485 


RTA00002692F.a. 1 3. 2. P.Seq 


F 


M00042625C:B04 


CH15CON


337 


21989 


RTA00002709F.h.22.1.P.Seq


F 


M00006861B:F09


CH02COH 


338 


58296 


RTA0000266 1 F.i.20.2. P.Seq 


F 


M00004354D:E05 


CH01COH


339 


379144 


RTA00002679F.l.14.1.P.Seq


F 


M00039705D:F02 


CH09LNL 


340 


379690 


RTA00002680F.b.2 1 .2. P.Seq 


F 


M00039778B:G03


CH09LNL 


341 


379640 


RTA0000268 1 F.d. 1 2.2. P.Seq 


F 


M00039859C:G10 


CH09LNL 


342 


373988 


RTA00002673F.h.23. 1 .P.Seq 


F 


M00039079A:A05 


CH09LNL 


343 


373988 


RTA00002673F.h.23.2.P.Seq 


F 


M00039079A:A05 


CH09LNL 


344 


380673 


RTA00002673 F.j. 1 3 .2. P.Seq 


F 


M00039084C:H03 


CH09LNL 


345 


55243 


RTA00002661 F.i.06.2. P.Seq 


F 


M00004282D:C11


CH01COH


346 


40557 


RTA000027 1 3 F.h.2 1 . 1 . P.Seq 


F 


M00027398C:F07


CH04MAL 


347 


375467 


RTA00002677F.m.03.1.P.Seq


F 


M00039417A:D03


CH09LNL 


348 


398406 


RTA00002679F.j.02.1.P.Seq


F 


M0003 t: >6S9C:E08 


CH09LNL 


349 


430392 


RTA00002668F.k. 1 9. 1 .P.Seq 


F 


M00033037D:C11


CH08LNH 


350 


376746 


RTA00002674F.f. 12. 1 .P.Seq 


F 


M00039133B:F08


CH09LNL 


351 


115595


RTA000027 1 3 F.e.07. 1 .P.Seq 


F 


M00027297A:C04 


CH04MAL 


352 


377182 


RTA00002682F.l.11.1.P.Seq


F 


M00040010A:F10


CH09LNL 


353 


380659 


RTA00002684F.e.07.2.P.Seq


F 


M00040124D:H01


CH09LNL 


354 


373862 


RTA00002671F.g.01.1.P.Seq


F 


M00038284B:H04 


CH09LNL 


355 


376096 


RTA00002677F.b. 16.2. P.Seq 


F 


M00039340A:D05 


CH09LNL 


356 


372887 


RTA00002670F.d.05.2. P.Seq 


F 


M00033358A:H12 


CH09LNL 


357 


378475 


RTA00002672F.g.24.2.P.Seq 


F 


M00039006D:B01 


CH09LNL 


358 


427336 


RTA00002665F.c.23.3.P.Seq


F 


M00028210B:D02


CH08LNH 


359 


373814 


RTA00002672F.b.02.2. P.Seq 


F 


M00038635A:G09 


CH09LNL 


360 


215506 


RTA00002664F.h.08.2. P.Seq 


F 


M00027438C:G07 


CH04MAL 


361 


374465 


RTA00002673F.c.07.2.P.Seq


F 


M00039058C:H02 


CH09LNL 


362 


428784 


RTA00002667F.c.13.1.P.Seq


F 


M00032744B:F10 


CH08LNH 


363 


379581 


RTA00002676F.a.2 1 .2. P.Seq 


F 


M00039273B:F02 


CH09LNL 


364 


378371 


RTA00002678F.f.20.2. P.Seq 


F 


M00039465A:A08 


CH09LNL 


365 


375154 


RTA00002676F.c.13.2.P.Seq


F 


M00039279B:H02 


CH09LNL 


366 


431214 


RTA00002669F.k.04. 1 .P.Seq 


F 


M00033262D:A11


CH08LNH 


367 


376053 


RTA00002675F.l.03.1.P.Seq


F 


M00039249A:C12 


CH09LNL 


368 


373282 


RTA00002680F.j. 19.2. P.Seq 


F 


M00039813B:D11


CH09LNL


369 


33397 


RTA00002661F.h.04.1.P.Seq


F 


M00004168A:G11


CH01COH


370 


376706 


RTA00002675F.C.02. 1 .P.Seq 


F 


M00039213B:F05 


CH09LNL 


371 


378292 


RTA0000268 1 F.i.09.2. P.Seq 


F 


M00039SS0A:H1 1 


CH09LNL 


372 


431612 


RTA00002669F.e.23.3. P.Seq 


F 


M00033202D:G06 


CH08LNH 


373 


378471 


RTA00002679F.o.17.1.P.Seq


F 


M0003972~C:B09 


CH09LNL 


374 


378666 


RTA00002681F.i.05.2.P.Seq


F 


M00039879C:F05 


CH09LNL 


375 


374894 


RTA00002675F.f.04.1.P.Seq


F 


M00039224A:E12 


CH09LNL 


376 


430191 


RTA00002667F.j.24.1.P.Seq


F 


M00032829B:E06 


CH08LNH 



377


428581 


RTA00002667F.c.12.1.P.Seq


F 


M00032739A:A06 


CH08LNH 


378


379598 


RTA000026 /9F.k.0j>. 1 .P.Seq 


F 


M00039697B:F11


CH09LNL 


379


45300 


RTA000027 10F.j.2j. 1. P.Seq 


F 


M00022434D:D06 


CH03MAH 


380 


23030 


RTA00002709F.b. 10. 1 .P.Seq 


F 


M00005384A:C11


CH02COH 


381


379928 


RTA00002679F.o.06.1.P.Seq


F 


M00039720D:D02 


CH09LNL 


382


430191 


RTA00002667F.k.01.1.P.Seq


F 


M00032829B:E06 


CH08LNH 


383


374684 


RTA00002675F.g.02.1.P.Seq


F 


M00039228A:B05 


CH09LNL 


384 


375728 


RTA00002676F.h.0D.2.P.Seq 


F 


M00039299B:G12


CH09LNL 


385 


230237 


RTA00002670F.b.08.2.P.Seq


F 


M00033306D:H09 


CH09LNL 


386 


380673 


RTA00002673F.j.13.1.P.Seq


F 


M00039084C:H03 


CH09LNL 


387 


378938 


RTA00002679F.k.20. 1 .P.Seq 


F 


M00039702A:B12


CH09LNL 


388


375115


RTA00002673F.e.01.1.P.Seq


F 


M00039066D:G08 


CH09LNL 


389


378673 


RTA00002680F.p.21.2.P.Seq


F 


M00039833A:F05 


CH09LNL 


390 


372909 


RTA00002670F.a.12.2.P.Seq


F 


M00033300D:H12


CH09LNL 


391 


373300 


RTA00002674F.c.21.1.P.Seq


F 


M00039126D:A08


CH09LNL 


392 


379318 


RTA00002683F.h.16.2.P.Seq


F 


M00040071B:A10


CH09LNL 


393 


378319 


RTA00002681F.k.07.2.P.Seq


F 


M00039890A:H05


CH09LNL 


394 


374608 


RTA00002675F.g.20. 1. P.Seq 


F 


M00039230A:A10 


CH09LNL 


395 


374328 


RTA00002673F.c.24.2.P.Seq 


F 


M00039061B:F08


CH09LNL 


396 


374328 


RTA00002673F.d.01.2.P.Seq


F 


M00039061B:F08


CH09LNL 


397 


428401 


RTA00002667F.b.07. 1 .P.Seq 


F 


M00032725C:F06 


CH08LNH


398 


136202 


RTA00002687F.p.05.2.P.Seq 


F 


M00040349D:B09


CH14EDT 


399 


374394 


RTA00002673F.c.15.1.P.Seq


F 


M00039059C:G08 


CH09LNL 


400 


37784 


RTA00002708F.c.17.1.P.Seq


F 


M00003816D:E11


CH01COH


401 


378282 


RTA00002681F.h.11.1.P.Seq


F 


M00039876D:H09 


CH09LNL


402 


185663 


RTA00002712F.p.17.2.P.Seq


F 


M00027178B:G09


CH04MAL 


403 


14866 


RTA00002709F.d. 1 4. 1 .P.Seq 


F 


M00005623D:G12


CH02COH 


404 


383502 


RTA00002670F.k.07.2.P.Seq 


F 


M00033446D:B02 


CH09LNL 


405 


13463 


RTA00002709F.f. 1 3. 1 .P.Seq 


F 


M00006657C:G05


CH02COH 


406 


21274 


RTA00002709F.m. 09.2. P.Seq 


F 


M00007194A:B09


CH02COH 


407 


13745 


RTA000027 14F.b. 13.1 .P.Seq 


F 


M00027801C:C11


CH04MAL 


408 


23485 


RTA00002714F.c.10.1.P.Seq


F 


M00027836D:F12


CH04MAL 


409 


428364 


RTA00002667F.c.09.1.P.Seq


F 


M00032737B:E09 


CH08LNH 


410 


431629


RTA00002669F.l.14.2.P.Seq


F 


M00033276B:G08 


CH08LNH


411


379754 


RTA00002682F.h.08.1.P.Seq


F 


M00039983D:A06


CH09LNL 


412 


431601


RTA00002669F.k.08.2.P.Seq 


F 


M00033263B:G04 


CH08LNH 


413 


375749 


RTA00002680F.t.2_v2. P.Seq 


F 


M0003 C) 795D:G06 


CH09LNL


414 


378764 


RTA00002681F.j.04.2.P.Seq


F 


M00039884A:H11


CH09LNL 


415


215605


RTA00002664F.l.20.1.P.Seq


F 


M00027647C:D03


CH04MAL 


416 


376144 


RTA0000_6 oF.j.09. 1 .P.Seq 


F 


M0003°:4IA:E I I 


CH09LNL 


417


j / jU / 1 


K I AUUUU—O / Ur J — ? — r .ieq 


F


iv1UUUjj , 44_A . UOo 


CH09LNL


418 


379684 


RTA00002681F.c.09.2.P.Seq


F 


M00039851B:G11


CH09LNL 


419 


379610 


RTA00002680F.k.11.2.P.Seq


F 


M0003^3l5C:F09 


CH09LNL 


420 


22392 


RTA00002708F.a.10.1.P.Seq


F 


MOO00!3^5D:HO2 


CH01COH


421 


377555 


RTA00002683F.l.08.2.P.Seq


F 


M00040085D:E04


CH09LNL 


422 


32624 


RTA00002713F.f.15.1.P.Seq


F 


M000:~347C:GQ- 


CH04MAL 


423 


375024 


RTA00002675F.p.12.1.P.Seq


F 


M0003°266DF!2 


CH09LNL



424 


374725 


RTA00002673F.f.02.1.P.Seq


F 


M00039070D:C02 


CH09LNL 


425 


376228 


RTA00002676F.f.l9.2.P.Seq 


F 


M00039293A:H04 


CH09LNL 


426 


375906 


RTA00002675F.i.18.1.P.Seq


F 


M00039238D:A08


CH09LNL 


427 


186190 


RTA00002714F.a.04.1.P.Seq


F 


M00027729D:H06 


CH04MAL 


428 


57694 


RTA00002713F.f.02.1.P.Seq


F 


M00027319D:B11


CH04MAL 


429 


7007 


RTA00002709F.d.08.1.P.Seq


F 


M00005614B:B01


CH02COH 


430 


400084 


RTA00002685F.o.19.2.P.Seq


F 


M00039641C:D07 


CH12EDT 


431 


375648 


RTA00002676F.h. 1 8.2. P.Seq 


F 


M00039301B:F06 


CH09LNL 


432 


166493 


RTA00002663F.h.08. 1 .P.Seq 


F 


M00022492C:A02 


CH03MAH 


433 


379632 


RTA00002682F.h. 14. 1 .P.Seq 


F 


M00039984B:G12 


CH09LNL 


434 


373234 


RTA00002676F.g. 1 5.2.P.Seq 


F 


M00039297C:H08 


CH09LNL 


435 


401230 


RTA00002685F.L05.2. P.Seq 


F 


M00039533A:C12


CH12EDT 


436 


186623 


RTA000027 1 2F.f. 1 5. 1 .P.Seq 


F 


M00026843B:D10


CH04MAL


437 


127714 


RTA00002712F.k. 14. 1 .P.Seq 


F 


M00027013A:C09 


CH04MAL 


438 


451857 


RTA00002692F.a.01.1.P.Seq


F 


M00042584B:C10 


CH15CON


439 


404620 


RTA00002687F.c.03.2.P.Seq 


F 


M00039770A:G11


CH14EDT 


440 


186872 


RTA00002663F.k.23.1.P.Seq


F 


M00022797B:G08 


CH03MAH 


441 


42729 


RTA00002709F.c.06.2.P.Seq 


F 


M00005458A:F11


CH02COH 


442 


373380 


RTA00002674F.b.07. 1 .P.Seq 


F 


M00039123A:B10


CH09LNL 


443 


374465 


RTA00002673F.c.07.1.P.Seq


F 


M00039053C:H02 


CH09LNL 


444 


403557 


RTA00002687F.d.10.2.P.Seq


F 


M00039948A:E03 


CH14EDT 


445 


16749 


RTA00002709F.b. 14. 2. P.Seq 


F 


M00005402B:F08 


CH02COH 


446 


375592 


RTA0000^680F.f.^.2.P.Seq 


F 


M000397Q5D:E10 


CH09LNL 


447 


376103 


RTA00002676F.g.06.2. P.Seq 


F 


M00039295B:D03


CH09LNL 


448 


40228 


RTA00002712F.l.18.1.P.Seq


F 


M00027049B:F05


CH04MAL 


449 


374606 


RTA00002673F.j.23.1.P.Seq


F 


M000J^6A:A05 


CH09LNL 


450 


378270 


RTA00002680F.h.08.2.P.Seq


F 


M0003^S0l A:Hll 


CH09LNL 


451 


236321 


RTA00002668F.k. 14. 1 .P.Seq 


F 


M00033034C:F02


CH08LNH 


452 


378676 


RTA00002680F.m.20.2.P.Seq


F 


M0003^S27B:F07 


CH09LNL 


453 


373252 


RTA00002670F.k. 16.2. P.Seq 


F 


M00033451A:H01


CH09LNL 


454 


384601 


RTA00002670F.k.06.2.P.Seq 


F 


M00033446C:G08


CH09LNL 


455 


403772 


RTA00002687F.a.03.2.P.Seq 


F 


M00039746C:G09 


CH14EDT 


456 


379566 


RTA00002683F.k.04.1.P.Seq


F 


M00040081C:E01


CH09LNL 


457 


136202 


RTA00002687F.p.05.1.P.Seq


F 


M00040349D:B09 


CH14EDT 


458 


14317 


RTA00002713F.c.13.1.P.Seq


F 


M00027248A:C02 


CH04MAL 


459 


375349 


RTA00002672F.j.11.1.P.Seq


F 


M00039024B:B10


CH09LNL 


460 


403020 


RTA00002687F.a.02.2.P.Seq


F 


MO00.^r46C:A08 


CH14EDT 


461 


374060 


RTA00002672F.i.07.1.P.Seq


F 


M00039014B:C04


CH09LNL 


462 


183399 


RTA00002712F.o.10.1.P.Seq


F 


M00027136C:C09


CH04MAL 


463 


373789 


RTA00002671F.c.20.1.P.Seq


F 


M00038259B:G08


CH09LNL 


464 


20168 


RTA00002711F.b.22.1.P.Seq


F 


M00022S34B:GI 1 


CH03MAH 


465 


452641 


RTA00002692F.d.05.2.P.Seq


F 


MOOO-L>003C:D08 


CH15CON


466 


431370 


RTA0000266^F.m.04.:. P.Seq 


F 


M000:32SSB:D12 


CH08LNH 


467 


153044 


RTA00002713F.j.03.1.P.Seq


F 


M000:^4"6A:C09 


CH04MAL 


468 


378229 


RTA00002679F.c.16.2.P.Seq


F 


M000_^bo3C:G09 


CH09LNL 


469 


374328


RTA00002673F.d.01.1.P.Seq


F 


M00039061B:F08


CH09LNL 


470 


39606 


RTA00002713F.i.20.1.P.Seq


F 


M000:"4oSA:C09 


CH04MAL 



471


59077


RTA00002713F.n.01.1.P.Seq


F


M00027596C:E06


CH04MAL


472


1935


RTA00002710F.b.11.1.P.Seq


F


M00008006B:B03


CH03MAH


473


379684 


RTA00002681F.c.09.1.P.Seq


F


M00039851B:G11


CH09LNL 


474 


451564


RTA00002691F.f.12.2.P.Seq


F


M00043411D:H06


CH17COHLV


475


7571


RTA00002710F.a.15.1.P.Seq


F


M00007943D:C09


CH03MAH


476


129323


RTA00002713F.k.21.1.P.Seq


F


M00027525B:D06


CH04MAL


477


12960


RTA00002710F.a.23.1.P.Seq


F


M00007976A:C10


CH03MAH


478 


186730 


RTA00002713F.o.05.1.P.Seq


F 


M00027641C:A03


CH04MAL 


479 


59077 


RTA00002713F.m.24.1.P.Seq


F


M00027596C:E06 


CH04MAL 


480 


185884 


RTA00002712F.b.06.1.P.Seq


F 


M000233I6C:G08 


CH04MAL 


481


19471 


RTA00002708F.g.08. 1 .P.Seq 


F 


M00004197B:H10


CH01COH 


482 


45206 


RTA00002710F.c.06.1.P.Seq


F 


M00008063B:A06 


CH03MAH 


483


404257 


RTA00002687F.g.06.2. P.Seq 


F 


M00040208A:C03


CH14EDT


484


372997 


RTA00002679F.p.04.1.P.Seq


F 


M00039729A:A10


CH09LNL 


485


43792 


RTA00002713F.k.16.1.P.Seq


F 


M00027520A:C05 


CH04MAL 


486 


400052 


RTA00002687F.h. 13.2. P.Seq 


F 


M00040291D:C05 


CH14EDT


487


452194 


RTA00002692F.c.14.2.P.Seq


F 


M00042983A:F06 


CH15CON


488 


24034 


RTA00002710F.b.06.1. P.Seq 


F 


M00007992C:F06 


CH03MAH 


489 


447544 


RTA00002689F.e.18.1.P.Seq


F 


M00042905D:D02 


CH15CON 


490 


401872 


RTA00002686F.c.23.1.P.Seq


F 


M00040141D:F05


CH13EDT 


491


376553 


RTA00002674F.g. 1 9.2. P.Seq 


F 


M00039139A:C09 


CH09LNL 


492 


455051 


RTA00002694F.a.07.1.P.Seq


F 


M00042595A:A11


CH20COHLV


493 


16760 


RTA00002708F.j.03.1.P.Seq


F 


M00004393B:E07 


CH01COH 


494 


374174 


RTA00002672F.i. 1 2.2. P.Seq 


F 


M00039015A:D07


CH09LNL 


495


374283 


RTA00002672F.k.2 1 .2. P.Seq 


F 


M00039030B:E02


CH09LNL 


496 


375772 


RTA00002681F.o.24.1.P.Seq


F 


M00039909C:G05


CH09LNL 


497 


376417 


RTA00002678F.i.03.2.P.Seq 


F 


M00039477D:A10 


CH09LNL 


498 


423971 


RTA00002666F.o.02.1.P.Seq


F 


M00032673C:D06 


CH08LNH 


499 


394098 


RTA00002681F.j.15.1.P.Seq


F 


M00039887C:E07 


CH09LNL 


500 


379761 


RTA00002670F.n.03. 1 .P.Seq 


F 


M00033561C:A02


CH09LNL 


501 


374266 


RTA00002674F.l.08.1.P.Seq


F 


M00039144C:E06


CH09LNL 




372946 


RTA00002670F.1.07. 1 .P.Seq 


F 


M00033457D:A05 


CH09LNL 


503


228909 


RTA00002664F.e.08. 1 .P.Seq 


F 


M00027085C:E11


CH04MAL 


504


427524 


RTA00002665F.e.05.1.P.Seq


F 


M00028354D:A03 


CH08LNH 




380413 


RTA00002680F.k.19.2.P.Seq


F 


M00039816C:D05


CH09LNL 


506


373866 


RTA00002671F.c.24.2.P.Seq


F 


M00038259C:H09 


CH09LNL 


507


427202 


RTA00002665F.2. 15.1 .P.Seq 


F 


M00028617C:A12


CH08LNH 


508


373000 


RTA00002670F.j.13.1.P.Seq


F 


M00033437C:C03 


CH09LNL 


509


378838 


RTA00002678F.p.11.2.P.Seq


F 


M00039637C:A10 


CH09LNL 


510


24945 


RTA00002710F.p.05.1.P.Seq


F 


M00022739A:B03 


CH03MAH 


511 


20277 


RTA00002710F.e.17.1.P.Seq


F


M00021972D:C11


CH03MAH 


512 


20820 


RTA00002710F.e.02.1.P.Seq


F 


M000:i9IOC:A10 


CH03MAH 


513 


376791 


RTA00002674F.1. 1 7.2. P.Seq 


F 


M00039166B:G06 


CH09LNL 


514 


9809 


RTA00002710F.g.12.1.P.Seq


F 


MOOO:217SB:D06 


CH03MAH 


515 


429562


RTA00002667F.m.03.1.P.Seq


F 


M00032853D:G12 


CH08LNH 


516 


129201


RTA00002710F.e.15.1.P.Seq


F 


M00021964C:E10


CH03MAH 


517 


377565


RTA00002684F.h.19.1.P.Seq


F 


M00040309A:E11


CH09LNL 



518




fx I rtUUUu-OOOr.U I .r.oeq 


r 


V^AAA^AA**"* \ . f~~^ 1 C\ 

iV1000j29j j A:C 1 0 


CH08LNH 


519


HZ / O JH 


R T A OftOfP aa > F f 00 1 P x^n 


r 


\><AAAAW^IA r\ .CrtO 


CH08LNH 


520


£1977 1 7 
4Z / / I J 


rx l auuuu_oo j r ,e — ? . i .r.jcC]. 


r 
r 


\^fAAAAC*ii ir.TAO 

MOOOzojohC :G0o 


CH08LNH 


521


171AH7 
J / JOU / 


dt a nnno7 a74F h i s 7 p 


r 


K,f Ann "AIT'A.C in 

ivlUUO j 9 1 z / O : h 1 U 


CH09LNL 


522


17R7R 1 


RT A 00007 67.1 F n 1 4 I P "s^n 


r 


IvlUUUjy 1 VOD. HUo 


CH09LNL 


523


*+zyjo 1 


l\ 1 MUUUU-OOOr.u. 1 l . 1 .r. JCCj 


c 
r 


» /AAn ^ a c c a rx .rm 

MUU0jzjjQD:C0z 


CH08LNH 


524


1 7 A7^a 


dt a nnoo^n^i f ^ u i p 


c 
r 


\/4AAAAOA^1C \ . UAA 

MU0UUoU4j A.HOz 


CH03MAH


525




dta nnon^ASF k in i p 

t\. i nUiJl/U-OOj r a. i u. l . r. jet] 


p 
r 


\ aa a 1 1 1 "rrnn 
ivlUUUjl4l /L.CjUy 


CH08LNH 


526


I OOOJ 


dt a nrinn~>7noF A i s t p q^/i 


r 


1^/fAAAAC^AC * ,/-A1 

M0000J62:: A:C02 


CH02COH 


527


77Q7A J 
J / V / O I 


K l auijuuzo / ur .n.u j .1. r.oeq 


r 


MOOOj j jo I C : AOz 


CH09LNL 


528


404U / 


rx I AUUUU-OOJ r.C. ! U. J . r . ocq 


r 


X^AAAAO 1 AZ! HS . \ A "* 

M00028196D:A03


CH08LNH 


S7Q 


7 1 7AS 
Z I JOJ 


RTAnnnn77nQF k Of, i p <s^r? 


c 
r 


\.fAAAA"7A 1 A P\. [_JAO 

ivlUUUU /U I ZD: HUo 


CHUiLUH 


sin 


*+Z / hoo 


rta nnnn^^As f h m i p 


c 
r 


\ A A A A A O t O 1 rx . 1 A 

IV1UU02S I o4D:U 10 


CH08LNH


SI 1 

J J 1 


4UUZ0J 


r t a nnnn~>ARs f r ni "> p 


c 

r 


\,1AAATl' , "7 1 D.DA"7 

IV1U0Uj9j74B. BU7 


/-~- m rx~r 

CM l 2hDT 


j jz 


J5UU 30 


dta nonn^ f a ia ~> p 

tx 1 AUUUU-OaUr.a. 1 o._. r.ocq 


r 


H^AAA" 1 . f7 1 I 

iv1000j9 / / j D:r I I 


T I A A r Vff 

CH09LNL 


S17 
3 J J 


J / J jZ4 


K 1 AUUUU-0 / or A. 1 Z.Z.r . j>«?q 


r 


MOOOj 96 1 23: B 10 


r t Ant v it 

CH09LNL 




Zj I OJ 


d t* \ Annn - ^ t 1 n f l- 17 1 d c^»^ 


r 
r 


M00022496B:E12


CH03MAH


SIS 






r 


MOOO j9d29L : DO / 


CH12EDT


J JO 


a,i a ao 


K I AUUUU-Oo 1 r .J. 1 J.z.r.oeq 


c 
r 


MOOOj 988 /C:E0/ 


CH09LNL 


S7.7 
JJ / 


i n a ~* a 
1 74 j0 


K 1 AUUUU- i i Ur . 1. 1 1.1 .r.beq 


c 

r 


M00022j6jD: A0j 


CH03MAH


N 7 0 
JJO 


j / jozU 


K 1 AUUUU-O 1 4r .u.Uo, 1 . r .bcq 


r 


M000j912 / A:G 1 I 


CH09LNL 


J J V 


"7 "7 n - in 

j /oj4o 


K 1 AUUUU_0 / - r .g. 1 4 — r.oeq 


r 


M000j9004 3:C 1 I 


y^" i iaai \ i r 

CH09LNL 




Alio iy 


DT \ Annn^AA 1 C f I Q "» D C^^, 

K l AuUUU_oo4r .r. 1 0 — r.oeq 


c 
r 


\ (AAA^"no rx \ A 1 

ivlOOO- / 228 D: AO I 


CH04MAL


J4 1 


J / Oo /4 


dt \nrin/PA7nc ^ 7 d c 
K 1 AUUUU_0 / Ur .e — 3 .z. r .oeq 


D 

r 


M 0 00 j j j 7 j A : U 04 


CH09LNL


SJ.7 
J*+Z 


1 1 jZV 


dt AnnnnoTfiQF k oq i d 

K ( AUUUU^ / UVr.D.Uo. 1 .r.oCq 


r 


MOOOOj j79A: c04 


CH02COH


SJ.1 


I I WUj 


d t a nnno~> i \ of n i ^ i d e Jn 
tx 1 aUUUU- / lUr.p. 1 J . 1 . r .ocq 


c 
r 


MUU0__75j(_ :O06 


/~" LI A \ ,1 A LI 

L HUjMAH 




177fl7Q 

j / /uzo 


rt a nnnn7A7xF n 7 1 7 p 

rx 1 /AUvUUZO / or.n.z 1 — r.ocq 


r 
r 


X.fAAA^A^ - ^ 1 \ • (~~" 1 A 

iVlUUU JVOJ 1 A - C 1 U 


' LT A A T XII 


SJ.S 

J 4 * J 


7 77 < ] 
J /JJJ 1 


rx 1 .-\UUUVJ_0/ 1 r .1. 1 o.j. r.ocq 


v 
r 


iViUUUjojZ / D.AUj 


/^UMAI XiT 


X/1A 

J40 


J /OUoZ 


DTA C\C\(\C\~> &~ IF m 17 I D Q^,-» 
K 1 AUUUU-0 / 4r.m. 1 / .I. r.ocq 


r 


\ * r\r\c\ "* A 1 1 1 O . T~"\ 1 I 

MOOOj 9 1 7 1 3: D I I 


J r A A I Xfl 

LHU9LNL 


S/A7 


J /OVo / 


dt \nnnn^^7QP a 7 1 7 d c 
K 1 AUUUU-O / or g.z 1 .1. r.ocq 


r 


\fAA/A" , ni^A/'~' OAO 

MOOOj 94 /ZL :B0o 


If AA f XII 


j4o 


0 I yz I 


K I AUUUU-00 1 r .g.Uo. 1 .r.ocq 


r 
r 


\ IAAAA ^ AH C ~ 5 ' — A "* 

M0000j99j 3: nOj 


CH01COH




J /J400 


d t a nnnrp />77 f h ni 7 p e -» n 


c 
r 


\ f A A A ^ O^^'CO TAO 

ivlUuujooj J to Uo 


AUAQI xrr 


ssn 


JOUJ J J 


RTAnnnfpA7nF 0 da 7 p c-,-. 
rx 1 aUUI/u_o / ur.o.uo — r.ocq 


r 

r 


MUUUjjJ/UCU 10 


/— TJ A A r x f T 

LHUV LiNL 


SS 1 




RTAnnnn^^A7F h ij. 1 p <^^r» 
tx 1 aUUUv_oo / r.n. i *t. 1 .r.ocq 


r 
r 


X.fAAA^OOAOD.r 1 A 

MUUOj-oUo d.O 1 0 


AUAOI XI LI 

LHOoLNH 


SS7 


"5 7Q77 1 

J / 7 11 1 


rt a onnn^A^" 5 f n n 1 1 p 

tx 1 nuuuu-Oo-r.n.u i . i .r.ocq 


r 


)V ,f AAA 1 A A 1 7r\-i'^A' 1 


Ajjnnr xf I 


S S 1 

JJJ 


171 S 17 
J / J JJZ 


dta nnorPA7~> f h i n 7 p «;^ n 

tx 1 auwu-u / — r .u. 1 u. — . r . ocq 


C 

r 


>/fAAA''QQQl \ rXA ? 

ivIUUU j 3VV 1 A. DU 1 


riJAQI XfT 

LHUyLM L 


SS4 


j 7563 j 


RTAnnnn^A77F m OS 7 p e Pn 
(\ 1 auuuw_u / /r.iii.wj — r.ocq 


V 

r 


tVl U U U j y4 I / b.MII 


AUAQ! Xn 

L HUV L;N L 


s s s 


17^1 SA 
J / O J JO 


RT AOOnO^AQ T F f 07 I P 

rx 1 auuuu-uo 1 r . i ,\j i . 1 .r.ocq 


r 


ivIUUU j y oOO b.AUo
373477


RTA00002672F.b.23.1.P.Seq
CH09LNL
RTA00002710F.g.02.1.P.Seq
CH03MAH
RTA00002709F.L09.1.P.Seq
CH02COH
RTA00002672F.h.15.2.P.Seq


F
RTA00002684F.a.03.2.P.Seq
664
RTA00002686F.g.11.1.P.Seq


F 


M00040181B:H09 


CH13EDT 


665 


428064 


RTA00002665F.l.04.1.P.Seq


F 


M00031485D:G02


CH08LNH 


666 


23310 


RTA00002708F.e.10.1.P.Seq


F 


M00004046C:A08 


CH01COH
376233


RTA00002677F.b.15.2.P.Seq


F


M00039339C:F03


CH09LNL
375848


RTA00002674F.m.03.1.P.Seq


F


M00039168C:A04


CH09LNL 


669 


242251


RTA00002665F.i.08.1.P.Seq


F


M00028772C:B09


CH08LNH 


670 


374064 


RTA00002672F.f.15.2.P.Seq


F


M00038999D:C11


CH09LNL 


671 


146260 


RTA00002663F.d.17.1.P.Seq


F 


M00022099B:D06 


CH03MAH 


672 


375575 


RTA00002677F.e.22.-\ P.Seq 


F 


M00039385B:E09 


CH09LNL 


673 


355518 


RTA00002665F.c.15.3.P.Seq


F


M00028201B:H12


CH08LNH 


674 


184223 


RTA00002662F.b.08.2.P.Seq


F


M00005539D:G01


CH02COH 


675 


213306 


RTA00002664F.e.07.2.P.Seq 


F 


M00027078A:B02 


CH04MAL 


676 


429566 


RTA00002668F.b.04.1.P.Seq


F 


M00032907A:G04 


CH08LNH 


677 


378656 


RTA00002682F.c.09.1.P.Seq


F 


M00039927A:F04 


CH09LNL 


678 


427760 


RTA00002668F.e.23.1.P.Seq


F 


M00032940A:C02 


CH08LNH 


679 


372795 


RTA00002683F.a.06.2.P.Seq 


F 


M00040032A:B03 


CH09LNL 


680 


429340 


RTA00002666F.f.12.1.P.Seq


F 


M00032577A:C04 


CH08LNH 


681 


429822 


RTA00002668F.e.17.1.P.Seq


F 


M00032939B:E07 


CH08LNH 


682 


375224 


RTA0O00"68OF.<±22.2. P.Seq 


F 


M00039783B:A06 


CH09LNL 


683 


378347 


RTA00002681F.h.07.1.P.Seq


F 


M00039875D:A10 


CH09LNL 


684 


380109 


RTA00002682F.i.17.1.P.Seq


F 


M0003998~C:G08 


CH09LNL 


685 


379001 


RTA00002683F.o.02.1.P.Seq


F 


M00040097A:C12 


CH09LNL 


686 


375348 


RTA00002676F.i.12.3.P.Seq


F 


M00039304D:B09 


CH09LNL 


687 


377889 


RTA00002672F.c.08.2.P.Seq


F


M00038661A:A07


CH09LNL 


688 


429883 


RTA00002667F.g.05.1.P.Seq


F


M00032793A:F06


CH08LNH 


689 


377067 


RTA00002682F.l.24.1.P.Seq


F


M00040014B:D01


CH09LNL 


690 


378001 


RTA00002681F.m.22.2.P.Seq


F 


M000398^8D:C06 


CH09LNL 


691 


45298 


RTA00002710F.j.21.1.P.Seq


F


M00022433A:E02


CH03MAH 


692 


375431 


RTA00002680F.f.03.1.P.Seq


F 


M00039793D:C05 


CH09LNL 


693 


377861 


RTA00002681F.m.20.2.P.Seq


F


M00039898A:A08


CH09LNL 


694 


428610 


RTA00002667F.e.09.1.P.Seq


F 


M00032766C:A04 


CH08LNH 


695 


20765 


RTA00002710F.i.10.1.P.Seq


F 


M00022363C:G12 


CH03MAH 


696 


27601 


RTA00002713F.e.23.1.P.Seq


F 


M00027314C:D09 


CH04MAL 


697 


430540 


RTA00002668F.o.20.2.P.Seq 


F 


M00033140D:F06


CH08LNH 


698 


381024 


RTA00002670F.h.23.2.P.Seq 


F 


M00033424B:A04


CH09LNL 


699 


16454 


RTA00002709F.f.07.1.P.Seq


F 


M000065^9D:B02 


CH02COH 


700 


372898 


RTA00002670F.i.03.2.P.Seq


F 


M00033424D:H12 


CH09LNL 


701 


373681 


RTA00002671F.d.20.1.P.Seq


F


M00038272D:F11


CH09LNL 


702 


82260 


RTA00002684F.h.06.2.P.Seq 


F


M 0004 030" B:F01 


CH09LNL 


703 


377343 


RTA00002684F.g.04.2.P.Seq


F 


M00040302C:A04


CH09LNL 


704 


374747


RTA00002676F.e.07.2.P.Seq


F 


M00039286A:C06 


CH09LNL 


705 


185848 


RTA00002712F.m.11.1.P.Seq


F


M00027080A:B01


CH04MAL 



706 


374311


RTA00002676F.e.18.2.P.Seq


F 


M000392S~C:A06 


CH09LNL 


707 


278923 


RTA00002667F.b.10.1.P.Seq


F 


M00032726C:C01 


CH08LNH 


708 


378667 


RTA00002681F.b.11.2.P.Seq


F 


M0003984-A:F06 


CH09LNL 


709 


380454 


RTA00002673F.j.16.1.P.Seq


F 


M00039084D:D07 


CH09LNL 


710 


381576 


RTA00002670F.i.04.2.P.Seq 


F 


M00033425A:C10 


CH09LNL 


711


375067


RTA00002675F.o.03.1.P.Seq


F 


M00039260C:G03 


CH09LNL 


712 


89706 


RTA00002714F.a.11.1.P.Seq


F


M00027741B:F09


CH04MAL


713 


10583 


RTA00002711F.h.11.1.P.Seq


F 


M00023100A:E12 


CH03MAH 


714 


379982 


RTA00002682F.i.16.1.P.Seq


F


M00039987C:E12


CH09LNL


715 


378532 


RTA00002680F.n.04.3.P.Seq 


F 


M00039828B:C05 


CH09LNL 


716 


379776 


RTA0000^680F.a.22.^.P.Seq 


F 


M00039774OA03 


CH09LNL 


717 


374136 


RTA00002673F.f.16.1.P.Seq


F 


M00039072OC03 


CH09LNL 


718 


98471 


RTA00002663F.j.21.1.P.Seq


F


M00022670D:H11


CH03MAH 


719 


125365 


RTA00002668F.j.07.1.P.Seq


F


M00033019B:E10


CH08LNH 


720 


375431 


RTA00002680F.f.03.2.P.Seq


F 


M00039793D:C05 


CH09LNL 


721 


62826 


RTA00002661F.g.20.1.P.Seq


F


M00004105D:D05


CH01COH


722 


379972 


RTA00002679F.e.10.1.P.Seq


F


M00039672D:D10


CH09LNL 


723 


377554 


RTA00002679F.f.10.1.P.Seq


F 


M00039675D:B03 


CH09LNL 


724 


230479 


RTA00002664F.c.16.2.P.Seq


F 


M00026915B:C06 


CH04MAL 


725 


98872 


RTA00002663F.j.19.1.P.Seq


F 


M00022668B:B12 


CH03MAH 


726 


42635 


RTA00002679F.h.18.1.P.Seq


F 


M00039684D:B08 


CH09LNL 


727 


379044 


RTA00002679F.a.10.2.P.Seq


F 


M00039652B:D05 


CH09LNL 


728 


96093 


RTA00002663F.j.07.1.P.Seq


F 


M00022640C:C12 


CH03MAH 


729 


403642 


RTA00002687F.d.01.2.P.Seq 


F 


M00039945C:F09 


CH14EDT 


730 


400921


RTA00002685F.b.18.2.P.Seq


F


M00039371B:H06


CH12EDT 


731 


93587 


RTA00002663F.k.10.1.P.Seq


F 


M00022731A:D02 


CH03MAH 


732 


79951 


RTA00002713F.c.18.1.P.Seq


F 


M00027258A:A07 


CH04MAL 


733 


176509 


RTA00002686F.b.09.1.P.Seq


F 


M00039756B:H06 


CH13EDT 


734 


451753 


RTA00002694F.e.06.1.P.Seq


F


M00043634A:C10


CH20COHLV 


735 


186266 


RTA00002713F.c.16.1.P.Seq


F


M00027256B:H09


CH04MAL 


736 


235052 


RTA00002692F.a.15.2.P.Seq


F


M00042626B:D08


CH18CON


737 


377233 


RTA00002682F.e.23.1.P.Seq


F 


M00039^40D:G08 


CH09LNL 


738 


378532 


RTA00002680F.n.04.2.P.Seq


F 


M00039828B:C05 


CH09LNL 


739 


177932 


RTA00002713F.b.22.1.P.Seq


F 


M00027233B:C01 


CH04MAL 


740 


9332 


RTA00002712F.p.18.1.P.Seq


F


M00027179D:E06


CH04MAL 


741 


240318 


RTA00002687F.d.04.2.P.Seq 


F 


M00039947A:D06 


CH14EDT 


742 


404260 


RTA00002687F.c.11.2.P.Seq


F


M00039942D:C01


CH14EDT 


743 


93767 


RTA00002712F.g.09.1.P.Seq


F


M00026868C:E11


CH04MAL 


744 


185642 


RTA00002712F.f.20.1.P.Seq


F 


M00026856D:F02 


CH04MAL 


745 


447544 


RTA00002689F.e.18.3.P.Seq


F 


M00042905D:D02 


CH15CON 


746 


403274 


RTA00002687F.b.10.2.P.Seq


F 


M00039766A:G07 


CH14EDT 


747 


404257 


RTA00002687F.g.06.1.P.Seq


F 


M00040208A:C03 


CH14EDT 


748 


403868 


RTA00002687F.k.05.2.P.Seq


F


M00040318C:H11


CH14EDT 


749 


450074 


RTA00002691F.e.12.2.P.Seq


F


M00043392D:C11


CH17COHLV 


750 


404520 


RTA00002687F.f.05.2.P.Seq


F


M00040202A:F05


CH14EDT 


751 


451789 


RTA00002692F.b.04.2.P.Seq 


F 


M00042O56C:B06 


CH18CON


752 


455178 


RTA00002694F.b.19.1.P.Seq


F 


M00043447A:C07 


CH20COHLV 




753 


455136 


RTA00002694F.a.08.1.P.Seq


F 


M00042595A:B01 


CH20COHLV 


754 


379001 


RTA00002683F.o.02.2.P.Seq


F 


M00040097A:C12 


CH09LNL 


755 


374763 


RTA00002673F.p.21.1.P.Seq


F


M00039118B:C05


CH09LNL 


756 


402508 


RTA00002686F.o.15.1.P.Seq


F


M00040281D:B01


CH13EDT 


757 


431370 


RTA00002669F.m.04.3.P.Seq


F


M00033288B:D12


CH08LNH 


758 


380500 


RTA00002670F.p.19.1.P.Seq


F 


M00033583B:E06 


CH09LNL 


759 


376743 


RTA00002678F.e.~ r \~>. P.Seq 


F 


M00039461A:F04


CH09LNL 


760 


191690 


RTA00002673F.m.19.1.P.Seq


F


M00039107C:E04


CH09LNL 


761 


374264 


RTA00002671F.p.21.2.P.Seq


F 


M00038620B:E09 


CH09LNL 


762 


373020 


RTA00002671F.b.20.2.P.Seq


F


M00033595A:C11


CH09LNL 


763 


375231 


RTA00002671F.m.20.2.P.Seq


F 


M00038387B:A07 


CH09LNL 


764 


16180 


RTA00002709F.j.17.1.P.Seq


F 


M00006977D:A03 


CH02COH 


765 


379403 


RTA00002683F.c.17.2.P.Seq


F 


M00040041C:C09 


CH09LNL 


766 


375382 


RTA00002677F.d.24.2.P.Seq


F 


M00039381D:C02 


CH09LNL 


767 


379653 


RTA00002683F.c.03.2.P.Seq 


F 


M00040038D:G04 


CH09LNL 


768 


377858 


RTA00002681F.e.14.2.P.Seq


F


M00039864A:A07


CH09LNL


769 


430861 


RTA00002668F.h.18.1.P.Seq


F 


M00032995C:C05 


CH08LNH 


770 


376128 


RTA00002677F.a.11.2.P.Seq


F 


M00039334B:E03 


CH09LNL 


771 


375009 


RTA00002676F.n.20.2.P.Seq


F 


M00039322A:F04 


CH09LNL 


772 


429816 


RTA00002667F.n.22.1.P.Seq


F


M00032871D:E11


CH08LNH 


773 


375657 


RTA00002681F.h.13.2.P.Seq


F 


M00039877C:C03 


CH09LNL 


774 


427889 


RTA00002666F.b.14.1.P.Seq


F 


M00032530D:C02 


CH08LNH 


775 


376761 


RTA00002677F.g.03.2.P.Seq


F


M00039391D:F08


CH09LNL 


776 


44025 


RTA00002684F.b.24.2.P.Seq


F


M00040115B:A04


CH09LNL 


777 


44025 


RTA00002684F.c.01.2.P.Seq


F


M00040115B:A04


CH09LNL 


778 


392524 


RTA00002681F.p.04.2.P.Seq


F 


M00039909D:C02 


CH09LNL 


779 


427252 


RTA00002665F.b.13.1.P.Seq


F 


M00028185B:A06 


CH08LNH 


780 


374927 


RTA00002673F.e.12.1.P.Seq


F


M00039068C:E06


CH09LNL 


781 


378226 


RTA00002680F.g.09.1.P.Seq


F


M00039797C:G05


CH09LNL 


782 


217964


RTA00002664F.g.08.2.P.Seq


F 


M00027299B:B12 


CH04MAL 


783 


376368 


RTA00002677F.b.14.2.P.Seq


F


M00039339A:H07


CH09LNL 


784 


377719 


RTA00002677F.j.11.2.P.Seq


F 


M00039407B:G02 


CH09LNL 


785 


378081 


RTA00002677F.e.16.2.P.Seq


F 


M00039384C:E02 


CH09LNL 


786 


89267 


RTA00002662F.b.01.2.P.Seq


F 


M00005445D:B01 


CH02COH 


787 


374927 


RTA00002673F.e.12.2.P.Seq


F


M00039068C:E06


CH09LNL 


788 


279054 


RTA00002667F.b.23.1.P.Seq


F 


M00032731B:C10 


CH08LNH 


789 


377283 


RTA00002682F.m.19.1.P.Seq


F 


M00040016C:H12 


CH09LNL 


790 


45318 


RTA00002710F.l.05.1.P.Seq


F 


M00022533A:A08 


CH03MAH 


791 


188292


RTA00002664F.e.23.2.P.Seq 


F 


M00027162B:F05 


CH04MAL 


792 


378872 


RTA00002683F.c.20.2.P.Seq 


F 


M00040042B:A10 


CH09LNL 


793 


427252 


RTA00002665F.b.13.3.P.Seq


F


M00028185B:A06


CH08LNH


794 


380618 


RTA00002673F.j.12.2.P.Seq


F 


M00039084C:G07 


CH09LNL 


795 


35646 


RTA00002667F.g.16.1.P.Seq


F


M00032797B:G02


CH08LNH


796 


46407 


RTA00002665F.c.10.3.P.Seq


F


M00028196D:A03


CH08LNH


797 


373720 


RTA00002674F.c.04.1.P.Seq


F


M00039124C:F03


CH09LNL 


798 


429693 


RTA00002668F.f.05.1.P.Seq


F


M00032944B:B02


CH08LNH


799 


377108 


RTA00002678F.p.04.2.P.Seq


F


M00039636C:D11


CH09LNL 



800


375657


RTA00002681F.h.13.1.P.Seq


F 


M00039877C:C03 


CH09LNL 


801


374868 


RTA00002673F.d.08.2.P.Seq 


F 


M00039063B:D08 


CH09LNL 


802 


428716 


RTA00002667F.e.08.1.P.Seq


F 


M00032766B:D12 


CH08LNH 


803 


44025 


RTA00002684F.c.01.1.P.Seq


F


M00040115B:A04


CH09LNL 


804 


430327 


RTA00002668F.k.11.1.P.Seq


F 


M00033033C:H01 


CH08LNH 


805 


374328 


RTA00002673F.c.24.1.P.Seq


F 


M00039061B:F08 


CH09LNL 


806 


376946 


RTA00002682F.n.10.1.P.Seq


F 


M00040019A:E01 


CH09LNL 


807 


375522 


RTA00002677F.n.08.2.P.Seq


F 


M00039420D:D03 


CH09LNL 


808 


395617 


RTA00002687F.b.15.2.P.Seq


F 


M00039767B:A04 


CH14EDT 


809 


21686 


RTA00002712F.g.05.1.P.Seq


F 


M00026865B:A06 


CH04MAL 


810 


452033 


RTA00002692F.a.09.2.P.Seq


F 


M00042623D:D07 


CH18CON 


811


25632


RTA00002711F.g.16.1.P.Seq


F 


M00023042D:D02 


CH03MAH 


812 


152487 


RTA00002663F.e.12.1.P.Seq


F 


M00022181C:D01 


CH03MAH 


813 


378226 


RTA00002680F.g.09.2.P.Seq


F 


M00039797C:G05 


CH09LNL 


814 


402446 


RTA00002686F.c.04.1.P.Seq


F 


M00040133B:B03 


CH13EDT 


815 


403642 


RTA00002687F.c.24.2.P.Seq


F 


M00039945C:F09 


CH14EDT 


816 


186359 


RTA00002713F.g.24.1.P.Seq


F 


M00027379C:B07 


CH04MAL 


817 


404290 


RTA00002688F.e.04.2.P.Seq


F


M00040395B:D11


CH14EDT 


818 


375443 


RTA00002676F.g.19.2.P.Seq


F


M00039298B:D03


CH09LNL 


819 


380279 


RTA00002673F.i.24.1.P.Seq


F 


M00039082B:A05 


CH09LNL 


820 


386110


RTA00002687F.e.06.1.P.Seq


F 


M00039955C:C04 


CH14EDT 


821 


380279 


RTA00002673F.j.01.1.P.Seq


F 


M00039082B:A05 


CH09LNL 


822 


386986 


RTA00002675F.p.06.1.P.Seq


F


M00039266A:B02


CH09LNL 


823 


186359 


RTA00002713F.h.01.1.P.Seq


F


M00027379C:B07


CH04MAL 


824 


375611


RTA00002677F.o.20.2.P.Seq 


F 


M00039425D:E12 


CH09LNL 


825 


378285 


RTA00002679F.h.01.1.P.Seq


F 


M00039681B:H09 


CH09LNL 


826 


44025 


RTA00002684F.b.24.1.P.Seq


F


M00040115B:A04


CH09LNL 


827 


25240 


RTA00002711F.c.12.1.P.Seq


F 


M00022854A:B03 


CH03MAH 


828


403700 


RTA00002687F.g.03.2.P.Seq 


F 


M00040207B:D08 


CH14EDT 


829 


404679 


RTA00002687F.f.07.1.P.Seq


F


M00040203A:H06


CH14EDT


830 


454806 


RTA00002693F.b.12.2.P.Seq


F


M00043093C:G11


CH19COP 


831 


376829 


RTA00002674F.f.21.2.P.Seq


F 


M00039135D:G02 


CH09LNL 


832 


456309 


RTA00002694F.d.16.1.P.Seq


F 


M00043518B:D06 


CH20COHLV 


833 


374510 


RTA00002672F.i.17.2.P.Seq


F 


M00039015D:H04 


CH09LNL 


834 


377232 


RTA00002683F.m.08.2.P.Seq


F 


M00040090B:G09 


CH09LNL 


835 


375779 


RTA00002672F.j.20.2.P.Seq 


F 


M00039025A:H09 


CH09LNL 


836 


90746 


RTA00002671F.a.07.2.P.Seq 


F 


M00033585D:A02


CH09LNL 


837 


453002 


RTA00002692F.b.21.2.P.Seq


F


M00042970C:H10


CH18CON


838


402863 


RTA00002686F.n.12.1.P.Seq


F 


M00040273B:H12 


CH13EDT 


839 


402526 


RTA00002686F.p.07.1.P.Seq


F


M00040286C:C02


CH13EDT 


840 


412778 


RTA00002685F.i.07.1.P.Seq


F 


M00039533D:F04 


CH12EDT 


841 


402273 


RTA00002686F.j.13.1.P.Seq


F 


M00040233C:G05 


CH13EDT 


842 


374744 


RTA00002670F.i.16.1.P.Seq


F


M00033427D:F01


CH09LNL


843 


375764 


RTA00002677F.o.18.2.P.Seq


F 


M00039425C:G01 


CH09LNL 


844 


428218 


RTA00002667F.c.01.1.P.Seq


F 


M00032731C:C07 


CH08LNH 


845 


374809 


RTA00002675F.h.01.1.P.Seq


F 


M00039230D:D09 


CH09LNL 


846 


20162 


RTA00002710F.n.20.1.P.Seq


F


M00022662D:G11


CH03MAH



847 


375782 


RTA00002677F.d.23.2.P.Seq 


F 


M0003938;C:H08 


CH09LNL 


848 


372958 


RTA00002672F.c.02.1.P.Seq


F 


M00038639D:F07 


CH09LNL 


849 


403940 


RTA00002688F.d.07.2.P.Seq


F 


M0004038"D:H05 


CH14EDT


850 


8490 


RTA00002711F.g.03.1.P.Seq


F 


M00023020C:G08 


CH03MAH 


851 


374809 


RTA00002675F.g.24.1.P.Seq


F 


M00039230D:D09 


CH09LNL 


852 


377788 


RTA00002684F.g.24.2.P.Seq 


F 


M00040305C:H06 


CH09LNL 


853 


13847


RTA00002711F.f.09.1.P.Seq


F 


M00022976C:F04 


CH03MAH 


854 


374172 


RTA00002673F.k.16.1.P.Seq


F 


M0003909^D:D06 


CH09LNL 


855 


380314 


RTA00002682F.l.07.1.P.Seq


F 


M00040009D:B07 


CH09LNL 


856 


47231 


RTA00002714F.b.15.1.P.Seq


F 


M00027813C:F01 


CH04MAL 


857


400287


RTA00002685F.k.10.1.P.Seq


F


M00039584C:C01


CH12EDT 


858 


400533 


RTA00002685F.a.02.2.P.Seq


F


M00039181D:E05


CH12EDT 


859 


447594 


RTA00002689F.c.07.1.P.Seq


F


M00042696B:E05


CH15CON


860 


147357 


RTA00002711F.e.15.1.P.Seq


F 


M00022928B:C01 


CH03MAH 


861 


401141 


RTA00002685F.o.22.2.P.Seq


F 


M00039642D:B12 


CH12EDT 


862 


404620 


RTA00002687F.c.03.1.P.Seq


F


M00039770A:G11


CH14EDT


863 


24360 


RTA00002709F.l.20.1.P.Seq


F 


M00007149A:G02 


CH02COH 


864 


380618 


RTA00002673F.j.12.1.P.Seq


F


M00039084C:G07


CH09LNL 


865 


448446 


RTA00002690F.d.09.3.P.Seq


F


M00042797D:D10


CH16COP 


866 


402313 


RTA00002686F.f.18.1.P.Seq


F


M00040174D:G03


CH13EDT


867 


273151 


RTA00002685F.c.05.2.P.Seq


F 


M00039374C:H02 


CH12EDT 


868 


404172 


RTA00002687F.d.17.2.P.Seq


F 


M00039951B:B12 


CH14EDT 


869


263630


RTA00002694F.e.10.1.P.Seq


F 


M00043637C:H01 


CH20COHLV 


870 


404277 


RTA00002687F.d.18.1.P.Seq


F


M00039951B:C03


CH14EDT 


871 


403557 


RTA00002687F.d.10.1.P.Seq


F 


M00039948A:E03 


CH14EDT 


872 


375161 


RTA00002676F.m.24.2.P.Seq 


F 


M00039319B:H12 


CH09LNL 


873


376829


RTA00002674F.f.21.1.P.Seq


F 


M00039135D:G02 


CH09LNL 


874 


372958


RTA00002672F.c.02.2.P.Seq


F 


M00038639D:F07 


CH09LNL 


875 


21578 


RTA00002709F.a.24.1.P.Seq


F


M00005351C:G05


CH02COH 


876 


402506 


RTA00002686F.b.17.1.P.Seq


F


M00039760B:B08


CH13EDT 


877 


141731 


RTA00002713F.b.04.1.P.Seq


F 


M00027212D:E03 


CH04MAL 


878 


37411


RTA00002661F.e.11.1.P.Seq


F 


M00003770A:E05 


CH01COH 


879 


372537 


RTA00002670F.c.05.2.P.Seq


F 


M00033345D:A09 


CH09LNL 


880 


380834 


RTA00002670F.c.08.2.P.Seq


F


M00033346C:A05


CH09LNL


881 


401492 


RTA00002685F.n.17.2.P.Seq


F 


M00039609D:F07 


CH12EDT 


882 


99998 


RTA00002662F.b.23.2.P.Seq


F 


M00006712C:H09 


CH02COH 


883 


404311


RTA00002688F.d.21.2.P.Seq


F 


M00040394A:D04 


CH14EDT 


884 


231084 


RTA00002664F.c.13.2.P.Seq


F 


M00026918B:D01 


CH04MAL 


885 


447679 


RTA00002689F.b.11.3.P.Seq


F 


M00042560A:F12 


CH15CON 


886 


377012


RTA00002682F.d.17.1.P.Seq


F 


M00039936C:C05 


CH09LNL 


887 


226207 


RTA00002664F.d.21.2.P.Seq


F 


M0002703fD:C06 


CH04MAL 


888 


446183 


RTA00002689F.a.12.1.P.Seq


F


M00042534A:A05


CH15CON


889 


428508 


RTA00002666F.c.24.1.P.Seq


F 


M00032545B:H09 


CH08LNH 


890 


157648


RTA000027 1 4F.0.20. l.P.Seq 


F 


M00027818C:C07 


CH04MAL 


891 


404609 


RTA00002688F.b.15.2.P.Seq


F


M00040377C:G07


CH14EDT 


892 


400464 


RTA00002685F.l.10.1.P.Seq


F 


M00039590D:D02 


CH12EDT 


893 


379108 


RTA00002685F.l.12.1.P.Seq


F


M00039591C:D06


CH12EDT 



894 


374639 


RTA00002676F.d.21.2.P.Seq


F


M00039284D:B12


CH09LNL 


895 


380674 


RTA00002673F.j.14.2.P.Seq


F 


M00039084C:H04 


CH09LNL 


896 


380674 


RTA00002673F.j.14.1.P.Seq


F


M00039084C:H04


CH09LNL 


897 


188972 


RTA00002664F.d.20.2.P.Seq


F


M00027030C:H06


CH04MAL 


898 


402835


RTA00002686F.c.01.1.P.Seq


F


M00040131D:G08


CH13EDT


899 


403774 


RTA00002687F.d.08.2.P.Seq


F


M00039947C:G03


CH14EDT


900 


374606 


RTA00002673F.j.23.2.P.Seq


F 


M00039096A:A05 


CH09LNL 


901 


192535 


RTA00002663F.m.14.1.P.Seq


F 


M00022925C:A08 


CH03MAH 


902 


377926 


RTA00002680F.U6.2.P.Seq 


F 


M00039820B:B06 


CH09LNL 


903 


186055 


RTA000027I2F.U 1.1. P.Seq 


F 


M00026926A:E10 


CH04MAL 


904 


380498 


RTA00002684F.i\ 1 1 .2.P.Seq 


F 


M00040129D:E10


CH09LNL 


905 


400236 


RTA00002685F.i.18.2.P.Seq


F


M00039561A:B07


CH12EDT 


906 


401070 


RTA00002688F.d.12.2.P.Seq


F 


M00040390A:H02 


CH14EDT 


907 


452622 


RTA00002692F.b.14.2.P.Seq


F


M00042962D:C05


CH18CON


908 


235052 


RTA00002692F.a.15.1.P.Seq


F


M00042626B:D08


CH18CON


909 


452221 


RTA00002692F.c.13.2.P.Seq


F


M00042986C:G12


CH18CON


910 


404581 


RTA00002687F.g.11.2.P.Seq


F 


M00040208D:G09 


CH14EDT 


911


376925


RTA00002687F.e.14.2.P.Seq


F


M00039957C:C09


CH14EDT


912 


400287 


RTA00002685F.k.10.2.P.Seq


F 


M00039584C:C01 


CH12EDT 


913 


403242 


RTA00002687F.l.05.2.P.Seq


F 


M00040323B:C12 


CH14EDT 


914 


453313


RTA00002693F.a.07.2.P.Seq 


F 


M00042614B:B05 


CH19COP 


915 


452633 


RTA00002692F.f.11.2.P.Seq


F


M00043067D:D10


CH18CON


916 


447679 


RTA00002689F.b.11.1.P.Seq


F 


M00042560A:F12 


CH15CON 


917 


452398 


RTA00002692F.f.17.1.P.Seq


F


M00043125C:A11


CH18CON


918 


449797 


RTA00002691F.b.22.3.P.Seq


F 


M00043334B:A10 


CH17COHLV 


919 


403916 


RTA00002687F.j.11.2.P.Seq


F 


M00040314D:H05 


CH14EDT 


920 


236906 


RTA00002693F.d.05.2.P.Seq 


F 


M00043154A:B07 


CH19COP 


921 


404161 


RTA00002687F.e.20.2.P.Seq


F


M00039958C:B09


CH14EDT 


922 


386110


RTA00002687F.e.06.2.P.Seq


F


M00039955C:C04


CH14EDT 


923 


451512 


RTA00002691F.b.02.3.P.Seq


F


M00043305B:G02


CH17COHLV


924 


400517


RTA00002687F.k.15.2.P.Seq


F 


M00040320D:F02 


CH14EDT 


925 


403578 


RTA00002687F.i.01.2.P.Seq


F


M00040296D:E09


CH14EDT 


926 


403578 


RTA00002687F.h.24.2.P.Seq 


F 


M00040296D:E09 


CH14EDT 


927 


403371 


RTA00002687F.h.19.2.P.Seq


F 


M00040294D:D12 


CH14EDT 


928 


452531 


RTA00002692F.f.16.1.P.Seq


F


M00043125A:B11


CH18CON


929 


454453 


RTA00002693F.f.15.2.P.Seq


F 


M00043215A:D02 


CH19COP 


930 


238270 


RTA00002692F.e.07.2.P.Seq 


F 


M00043028A:G05


CH18CON


931 


14583 


RTA00002687F.f.08.2.P.Seq 


F 


M00040203B:A05 


CH14EDT 


932 


400464 


RTA00002685F.l.10.2.P.Seq


F 


M00039590D:D02 


CH12EDT 


933 


404642 


RTA00002687F.f.02.2.P.Seq 


F 


M00040201C:G11


CH14EDT 


934 


380413 


RTA00002680F.k.19.1.P.Seq


F


M00039816C:D05


CH09LNL 


935 


287963 


RTA00002693F.c.20.2.P.Seq 


F 


M00043148C:A09


CH19COP 


936 


20847 


RTA00002710F.d.09.1.P.Seq


F 


M00021852D:A05 


CH03MAH 


937 


456531 


RTA00002694F.b.18.1.P.Seq


F 


M00043446C:E12 


CH20COHLV 


938 


450463 


RTA00002694F.a.12.1.P.Seq


F 


M000425^6C:D07 


CH20COHLV 


939 


456713 


RTA00002694F.d.13.1.P.Seq


F


M00043513D:G08


CH20COHLV 


940 


455508 


RTA00002694F.a.15.1.P.Seq


F 


MOOO-i25^-B:EI2 


CH20COHLV 



941 


376138 


RTA00002674F.m.05.2.P.Seq 


F


M00039169A:E12


CH09LNL 


942 


402831 


RTA00002686F.m.03.1.P.Seq


F 


M00040264D:G05 


CH13EDT 


943 


373820 


RTA00002674F.d.06.2.P.Seq 


F 


M00039127A:G11


CH09LNL 


944 


85388 


RTA00002674F.c.06.2.P.Seq 


F 


M00039124C:H08 


CH09LNL 


945 


400732 


RTA00002685F.k.24.2.P.Seq 


F 


M00039587C:F12


CH12EDT 


946 


431629 


RTA00002669F.l.14.1.P.Seq


F


M00033276B:G08


CH08LNH 


947 


449349 


RTA00002690F.d.12.3.P.Seq


F


M00042802C:C04


CH16COP 


948 


401124


RTA00002685F.o.11.2.P.Seq


F


M00039629D:B04


CH12EDT


949 


453233 


RTA00002693F.a.01.2.P.Seq


F 


M000426M A:A06 


CH19COP 


950 


124813 


RTA00002685F.j.10.2.P.Seq


F 


M00039564B:C01 


CH12EDT 


951 


454627 


RTA00002693F.f.09.2.P.Seq


F 


M00043210C:E05 


CH19COP 


952 


169464 


RTA00002663F.i.19.1.P.Seq


F 


M00022602A:E09 


CH03MAH 


953 


451654 


RTA00002692F.f.02.2.P.Seq


F


M00043044D:A09


CH18CON


954 


406092 


RTA00002685F.k.11.2.P.Seq


F


M00039534C:C11


CH12EDT


955


453501 


RTA00002693F.d.14.2.P.Seq


F


M00043162D:C12


CH19COP 


956 


450845 


RTA00002691F.f.10.1.P.Seq


F


M00043410C:A09


CH17COHLV


957 


448177


RTA00002690F.e.12.1.P.Seq


F


M00042839B:B11


CH16COP


958 


402617 


RTA00002686F.b.21.1.P.Seq


F


M00040131B:D11


CH13EDT


959 


378014 


RTA00002680F.g.17.1.P.Seq


F


M00039799A:D10


CH09LNL 


960 


124813 


RTA00002685F.j.10.1.P.Seq


F


M00039564B:C01


CH12EDT


961


29450 


RTA00002663F.d.07.1.P.Seq


F 


M00022054A:H03 


CH03MAH 


962 


400486 


RTA00002685F.e.02.1.P.Seq


F


M00039496B:D08


CH12EDT 


963 


44753 


RTA00002713F.f.05.1.P.Seq


F


M00027324D:C05


CH04MAL


964 


448177 


RTA00002690F.e.12.2.P.Seq


F


M00042839B:B11


CH16COP 


965 


447697 


RTA00002689F.e.15.3.P.Seq


F


M00042905A:F11


CH15CON 


966 


240313 


RTA00002687F.d.04.1.P.Seq


F


M00039947A:D06


CH14EDT 


967 


451620 


RTA00002691F.d.20.3.P.Seq


F 


M00043379D:H02 


CH17COHLV 


968


400157


RTA00002685F.i.20.2.P.Seq


F


M00039561B:A09


CH12EDT 


969 


400276 


RTA00002685F.h.16.2.P.Seq


F


M00039528B:B12


CH12EDT 


970 


449779 


RTA00002691F.d.04.3.P.Seq


F 


M00043367B:A08 


CH17COHLV 


971 


400157 


RTA00002685F.i.20.1.P.Seq


F 


M00039561B:A09 


CH12EDT 


972 


238133 


RTA00002685F.e.03.2.P.Seq 


F 


M00039496B:H09


CH12EDT 


973 


452015 


RTA00002692F.c.07.2.P.Seq


F


M00042981B:D11


CH18CON


974 


400732 


RTA00002685F.l.01.2.P.Seq


F


M00039587C:F12


CH12EDT 


975 


24984 


RTA00002711F.d.21.1.P.Seq


F 


M00022910A:A06 


CH03MAH 


976 


449040 


RTA00002690F.e.14.2.P.Seq


F


M00042841D:H07


CH16COP 


977 


377431 


RTA00002671F.i.15.3.P.Seq


F


M00038303A:C03


CH09LNL 


978 


400910 


RTA00002685F.b.07.1.P.Seq


F


M00039367B:H02


CH12EDT 


979 


376945 


RTA00002682F.k.23.1.P.Seq


F


M00040007D:A06


CH09LNL


980 


15906 


RTA00002709F.e.14.1.P.Seq


F 


M00005305D:D12 


CH02COH 


981 


452781 


RTA00002692F.b.16.2.P.Seq


F


M00042966B:F07


CH18CON


982 


415294 


RTA00002686F.f.14.1.P.Seq


F


M00040173D:B05


CH13EDT 


983 


401644


RTA00002685F.n.16.1.P.Seq


F 


M00039608D:H01 


CH12EDT 


984 


404402 


RTA00002687F.a.19.2.P.Seq


F


M00039761D:E10


CH14EDT


985 


401709 


RTA00002685F.n.24.2.P.Seq


F


M00039624A:H09


CH12EDT


986 


401644 


RTA00002685F.n.16.2.P.Seq


F


M00039608D:H01


CH12EDT 


987 


452531 


RTA00002692F.f.16.2.P.Seq


F


M00043125A:B11


CH18CON



988


400910


RTA00002685F.b.07.2.P.Seq


F


M00039367B:H02


CH12EDT


989 
990 


449235 
449794


RTAOOOO^QOF ^ ~>~> ^ P 
RTA0000 n 69 1 F c ">~> "> P Ser 


F


M00042439B:B03


CH16COP


991 


400921


RTA0000 n 685F b IX 1 P Ser 


F
F


M00043361B:A01
M00039371B:H06


CH17COHLV
CH12EDT


992 


373874 


RT A0000~ > 677 F c 77 7 P S,*r 


F


M00038663D:H10


CH09LNL 


993 


401050 


RT AOOOO^fiflSF e OQ ~> P S,»n 


F 


M00039499C:A04


CH12EDT 


994 


453237 


rtaoooo^q^f r 07 ~> p Q**n 


F 


M00043108A:F06


CH19COP 


995 


449294 


RTA000CP690F c ! "5 "? P Sr»n 


F


M00042770C:C04


CH16COP 


996 


404260 


RTA0000^6K7F c 1 1 1 P v>n 


F


M00039942D:C01


CH14EDT 


997 


378014 


RTAOOOO^fiKOF a 17 P 


F


M00039799A:D10


CH09LNL 


998 


404726 


RTA0000~>6£XF i IX "> P v^n 

iv i nv^vwu — voor .u. i o.«..r . jCU 


F


M00040371C:H05


CH14EDT 


999 


451347 


RTA0000^691F b 11 1 P Sen 


F


M00043311C:E03


CH17COHLV 


1000 


401154


RTA0000^6£^F r> Oft 7 P Sen 


F


M00039497C:C06


CH12EDT 


1001 


401870 


RTA0000 n 6K6F 1 P <;,»n 


F


M00040131C:F03


CH13EDT 


1002 


400170 


RTA00OO^fi£5F h 0"? 7 P 


F


M00039366C:B07 


CH12EDT 


1003 


25387 


RTA0000 n 7 ! 1 P F 1 Q 1 P 


F 


M00023001C:C08 


CH03MAH 


1004 


377085 


RTA0000 n 67KF n Id I P <;^»n 


F


M00039619B:D02


CH09LNL


1005 


403530 


rtaoooo^k^f i no p 


F 


M00040368A:F01


CH14EDT


1006


372930 


RT A OOOfP A70F I 17 ~> p c^^, 


F 


| M0003343 T C:A07 


CH09LNL 


1007


401120




F 


M00039379A:B03 


CH12EDT 


1008


403397 


RTA00002687F.h.02.2.P.Seq


F 


M00040219B:D02 


CH14EDT 


1009 


449337 


RTAOOOO^fSQOF r IK 1 P S,*n 


F


M00042774C:C05 


CH16COP 


1010 


403561 


RT AOOOO^fSSSF H 06 ~> P 


F


M0004038'C:E07 


CH14EDT 


1011 


134182


RT AOOOfP ftO^F H 1 1 ~> P 

iv i yj — KJ* r.Ll. 1 J . r. jCU 


F


M00043011A:H12


CH18CON


1012 


377085 


RT A000CP67KF n 1.1 "> P 

* V I i\ \J\JW\J — (J r Ol .11. 1 ~r . r. JCQ 


F


M00039619B:D02


CH09LNL 


1013 


376138 


RTAOO0O" > ( £ i74F m OS 1 P <Z*n 


F


M00039169A:E12


CH09LNL 


1014


401154


RTA0000^6SSF pO^ 1 P <;^n 


F


M00039497C:C06


CH12EDT


1015 


449825 


RTAOOOO^^Q IFh 14 ^ P <;^n 


F


M00043320B:A07 


CH17COHLV 


1016 


403896


RTA00002687F.a.04.2.P.Seq


F


M00039746C:H05


CH14EDT


1017


377632 


RT AOOOn^^S"? F 1 ! 9 P 

rv i auvuu_oojr.i. i (j.«,.r . jcQ 


F


M000400S"D:F08 


CH09LNL 


1018


450845 


RTAOOOO^^Q 1 F f 1 0 P <s^rt 


F


M00043410C:A09


CH17COHLV 


1019 


450045 


RTA0000^69 1 F e 1 0 ^ P Sen 


F


M00043391A:C10


CH17COHLV 


1020 


402962 


RTA0000^6K6F H 1 P 


F


M0004014"D:HI 1 


CH13EDT


1021 


427674 


RTA0000^66SF i 10 1 P 

* v i r \ uvvu^uu J l .J. i u. I ,r. JCLj 


F


M00028 / / > D:F03 


CH08LNH


1022 


403252 


RTA0000^68SF c 15 "> P^n 


F


M00040383D:C04


CH14EDT 


1023 


452038 


RTA000O" ! 69' :> F n 09 I P Spn 


F


M00042623D:D07


CH18CON


1024 


401553 


RT A 0000 ~* 68 > F H 09 ^ P ^^r) 
IV 1 .•\uvwu_uojr,u.uo i. JCU 


F


M00039482B:G02


CH12EDT 


1025 


451092


RTA0000"69 1 F d 17 > P Sen 


F


M0004jj/ A.COj 


CH17COHLV


1026 


403978 


RTA0000"687F a 09 ^ P Sen 


F


M00040208B:A07


CH14EDT


1027 


377186 


RT A0000^68' > F m 07 I P Sen 


F


M000400 UD:F03 


CH09LNL 


1028


404679 


RTA00002687F.f.07.2.P.Seq 


F 


M00040203A:H06


CH14EDT


1029 


373875 


RTA00002674F.c.05.1.P.Seq


F 


M00039124C:H02 


CH09LNL 


1030 


128841 


RTA00002685F.o.15.2.P.Seq


F 


M00039630C:H04 


CH12EDT 


1031 


33971 


RTA00002713F.h.13.1.P.Seq


F


M00027392B:H02 


CH04MAL 


1032 


332878 


RTA00002666F.h.13.1.P.Seq


F 


M000325«"C:B01 


CH08LNH


1033 


400781 


RTA00002685F.j.03.2.P.Seq


F 


M00039562B:G02


CH12EDT


1034 


456456 


RTA00002694F.b.22.1.P.Seq


F


M00043440A:E12


CH20COHLV





1035 


407^17 




F


\AC\l\t~\ lATs"" Pv . M 

tvlUUUdUJ^ / U. H 1 U 


CH13EDT


1036 


401974


RTA0000^6S6F i IS 1 P Sen 


F


vtnnn -1 m - ' "* \ rns 
ivlUUUdU^ : A . v_ u^ 1 


CH13EDT


1037 


4SS 1 4 1 


RTA nOOO~>694F b 14 1 P Sen 


F


| iV1UUU4j44U<w . t5U / 


CH20COHLV 


1038 


4070S7 


RTAnO0fP6fl6F 1 14 1 P Sen 

IV 1 r\\J\J\J\J — UOUT .1. 1 1 . r 


F


lvlUUU4U^oUC_ . UU4 


CH13EDT 


1039 


407SSS 

"-rU-i J J J 


RT A 0O0O^6S6F m 14 I P Sen 
rx 1 Auuuu.oour .111. 1-+. 1 . r . ocij 


F


,\yinnn/i m^*r/~-r^n/i 


CH13EDT


1040


406007 


RTA 000n^6RSF lv 11 1 P Sen 


F


VfAAA^OsO 1 /~* - 1 1 

ivIUUUj VjSdC^ .v, 1 1 


CH12EDT


1041 


J / HJ J 1 


RTA nO0fP674F i ~>0 1 P Sen 


F


IV1UUU j 7 1 4 / A. r l U 


f t_r ao 1 xr 1 


1042 


407^/;s 


RT A000n^6J?6F i OR 1 P Sen 


F


IV1UUU4U4j>U A . MU-i 


CH13EDT




40 1 87 R 


RTA000076S6F i 14 1 P Sen 


F


Mnnr\/im 1 1 rv 007 
1V1UUU4U4J - U. t5U / 


CH13EDT


1044 


447AAQ 


RTA00007ASQP 1 1 S ~> P S*»n 


F


IVIUUU44^_ jd.cUo 


LHIDLUN 


1045


407 S8x 


RTA0f)0fP6£6F V \R 1 P Sen 
fx 1 nUUUU-OOOr.N. 1 O. I .r.ocu 


F


IV1UUU4U_^4 d . l_ iU 


CH13EDT


1046 


7448 S<* 


RTAnnOO~>6S6F 1 0~> 1 P Sen 


F


1VIUUU4U-. j O A . AUO 


CH13EDT


1047 


4077^0 


RT A0000^6S6F i ~>0 1 P Sen 


F


IvlUUU^U-i-O A. M 1U 


CH13EDT


1048 


dO 1 766 


RTA0000" > 6X6F n 16 1 P Sen 
rv 1 r\\j\j\j\j uour.u. 1 lj. 1 .r .oct^ 


F


LViUUU4U4o_ A. AUJ 


CH13EDT


1049 


4070S7 


RTA 000D^6S6F a 14 \ P Sen 


F


IV1UUU4U l 0 1 D . ri 1 U 


CH13EDT


1050


44Q66Q 


RTA000n~>6Q0F r 10 > P S^n 


F


\fi AAA 1 ^ 7 A "* n - f~i ! A 
ivIUUU4- / 0 / D . U 1 U 


CH16COP


1051


400S~>0 


RTA 0000^6^ SF cr 04 "» P Sr»n 


F


iv1UUUj> 1 . UU0 


CH12EDT


1052


4Uj<SOo 


R T A 000O^6^7F Lr 0> 1 P 


F


ivlUUUdUj t 3L . ri 1 I 


CH14EDT


1053




K 1 AUvJUU-Oo / r.I.U?. 1 .r.o^q 


F


iv1UUU4Uj2j B.L 1 2 


CH14EDT 


1054


L+yJJ. 1 o _ 


tx 1 nU l JUU_DOOr.l. 1 O. 1 . r .jcC] 


F


\f aaa^a 1 t i/^".rr in 


CH13EDT


1055




P T A nnOO'~ > 6Q0 F c 1 "> ^ P Q^n 


F


NilAAA.I ^7TAD.D IT 


CH16COP


1056


jrt i inn 
4U 1 -iVU 


rv 1 rtUUtU-Oojr.n. 1 u. i , r. jcq 


F


viAAA -1 n/in^: - hao 


CH12EDT


1057




RTA 0D00^6Q0F H 0"^ ^ P S**n 


F


N/1AAA H"QAr TA7 

tvlUUU4_ / VUL .l_U / 


CH16COP


1058


J /'tJJ I 


RTA 0000^6 74F i ^0 ^ P S^»n 


F


iVIUUUj V 1 4 i A . r 1 U 




1059


449JA4 


RTA 000O" ) 6Q0F r OR " P Sen 


F


\A AAA,! ">7A > r~ • PtA 1 


CH16COP


1060


dO 1 07Q 
HU III / 7 


RTA 000(T>6SSF n OS "» P Sen 


F


NyfOAA'iQA iir onj 
ivlUUU J V04 J l_ . DU4 


CH12EDT


1061


dO^Q 1 6 


RTA 00(^0^6^ 7 F i 1 1 1 P Sen 


F


VfAAA 1A7 1 1 r> ■ UTA > 


CH14EDT


1062 


dO 1 ^74 
*+\J 1 J / -t 




F


\j\ A A A ^ Q A KT'Cni 


C ri 1 ~ t U 1 


1063


dOOSO^ 


RTA ClC)0C)~>£>R SF L- 0^ I P Sen 


F


\vfAAA^Q>"'An - r> IA 


CH12EDT


1064


7 1 QJR7 S 

Z. \ 70-J 


RTA 0000^664 F h 06 ^ P Sr»n 

JV 1 nUvUU-DO'+r.ll.UO 1..3CL1 


F


\ J 1AAA17^0hn-r,A9 


CH04MAL


1065


^777^7 
J / / i J_ 


RTA00(^0^691 F n OQ p S^n 


F


N/AAA'IQQ 1 ATT. 1 A 
IvlUUU J VV 1 Ul_ . VJ 1 U 


CH09LNL


1066


J Ovj40 


RTA 000("P6K4F H I"* 1 P Sen 


F


M00040 ri Q-rn> 


CH09LNL


1067 


44QS4Q 


RT A0000^690F a 09 " P Sen 


F


M0004" , 4'; 1 C ■ FO 1 


CH16COP


1068 


40777^ 


RTA00(")0 n 686F r" OS 1 P Sen 


F


N/100040 ! 60 R-FOn 


CH13EDT


1069


40 1 777 

*r\J \ 1 f 


RTAOOOO^ASSF n P Sen 


F


\z1000^Q64 n n- WOO 


Lnl-CU t 


1070 


^79878 

J / /O /o 


RTAOOOO^S^F h 1 "* 1 P Sen 

IX 1 . \ U vV. W — vJO— 1 .11. 1 — . Ot J 


F 


\/ioooiQ094 a rn^ 


nil? L.>L 


1071




RT \00ClV68 1 F a 08 "* P Sea 


F 


M000 iQS'OP FOS 


CH09LNL


1072 


44S06S 


RTA00C0 n 690F c ^ ^ P Sea 


F


M0004~>7<? 1 A • A 07 


CH16COP


1073 


40^49^ 


RT A 0000^687F i 0" 1 P Sen 


F 


\/100040^ nn- C04 


CH14EDT


1074 


4005 1 7 


RTA0000^6S7F k IS ! P Sen 

IX 1 i v W v ' J — U O » 1 .IV. 1 _ . 1 .I.OCLj 


F 


M00040" ^O n ■ CO" 5 




1075 


456636 


RTA00002694F.e.05.1.P.Seq


F


M00043632D:F09


CH20COHLV 


1076 


400101 


RTA00002685F.o.04.1.P.Seq


F


M00039625B:G08


CH12EDT


1077 


403578 


RTA00002687F.i.01.1.P.Seq


F


M00040296D:E09


CH14EDT 


1078 


402419 


RTA00002686F.g.20.1.P.Seq


F


M00040184C:A11


CH13EDT 


1079 


375161 


RTA00002676F.I1.01.2.P.Seq


F 


M00039319B:H12 


CH09LNL 


1080 


401851


RTA00002686F.d.07.1.P.Seq


F


M00040143A:H05


CH13EDT 


1081 


400567 


RTA00002685F.a.14.2.P.Seq


F


M00039361B:E01


CH12EDT 





1082 


376641 


RTA00002677F.d.01.2.P.Seq


F


M00039345A:D09


CH09LNL 


1083 


376641 


RTA00002677F.c.24.2.P.Seq 


F 


M00039345A:D09 


CH09LNL 


1084 


400450 


RTA00002685F.j.22.1.P.Seq


F 


M00039570A:D10 


CH12EDT 


1085 


375373 


RTA00002676F.h.12.1.P.Seq


F


M00039300C:C09


CH09LNL 


1086 


375373 


RTA00002676F.h.12.2.P.Seq


F 


M00039300C:C09 


CH09LNL 


1087 


413643 


RTA00002685F.n.05.2.P.Seq 


F 


M00039604D:G03 


CH12EDT


1088 


448874 


RTA00002690F.c.02.3.P.Seq


F


M00042759B:G11


CH16COP 


1089 


376511


RTA00002674F.h.04.1.P.Seq


F


M00039140A:B08


CH09LNL 


1090 


374040 


RTA00002674F.h.21.1.P.Seq


F


M00039142D:B11


CH09LNL 


1091 


454132 


RTA00002693F.e.18.1.P.Seq


F


M00043191A:A07


CH19COP 


1092 


404581


RTA00002687F.g.11.1.P.Seq


F 


M00040208D:G09 


CH14EDT 


1093 


260521 


RTA00002689F.c.13.1.P.Seq


F


M00042702B:G02


CH15CON 


1094 


379564 


RTA00002687F.o.12.1.P.Seq


F


M00040346A:C11


CH14EDT 


1095 


452491 


RTA00002692F.f.05.2.P.Seq


F


M00043046D:B11


CH18CON 


1096 


403541 


RTA00002687F.p.20.2.P.Seq


F


M00040364A:E05


CH14EDT 


1097 


404636 


RTA00002688F.b.11.2.P.Seq


F 


M00040376C:G02 


CH14EDT 


1098 


379564 


RTA00002687F.o.12.2.P.Seq


F


M00040346A:C11


CH14EDT


1099 


451548 


RTA00002691F.b.09.3.P.Seq


F


M00043310C:G06


CH17COHLV 


1100


454308 


RTA00002693F.f.14.1.P.Seq


F


M00043213B:B12


CH19COP 


1101 


401184


RTA00002685F.d.04.2.P.Seq 


F 


M00039380C:C09


CH12EDT 


1102


401290


RTA00002685F.n.10.2.P.Seq


F


M00039606B:D08


CH12EDT 


1103


400101


RTA00002685F.o.04.2.P.Seq


F


M00039625B:G08


CH12EDT


1104


454308


RTA00002693F.f.14.2.P.Seq


F


M00043213B:B12


CH19COP


1105


452622


RTA00002692F.b.14.1.P.Seq


F


M00042962D:C05


CH18CON


1106


450012


RTA00002691F.d.09.3.P.Seq


F


M00043370B:C08


CH17COHLV 


1107


400503 


RTA00002685F.k.02.2.P.Seq 


F 


M00039570B:D10


CH12EDT 


1108


400450 


RTA00002685F.j.22.2.P.Seq


F 


M00039570A:D10 


CH12EDT 


1109


446166 


RTA00002689F.c.17.1.P.Seq


F 


M0004271 : 3: A I 1 


CH15CON 


1110


456233


RTA00002694F.e.08.1.P.Seq


F 


M0004363c3:C06 


CH20COHLV 


1111


25443


RTA00002710F.d.15.1.P.Seq


F 


M0002!86cD:A03 


CH03MAH 


1112


404119


RTA00002688F.d.17.2.P.Seq


F


M00040392C:B12


CH14EDT


1113 


403642 


RTA00002687F.d.01.1.P.Seq


F 


M00039945C:F09 


CH14EDT 


1114


403493


RTA00002687F.j.03.2.P.Seq


F 


M0004031JD:E04 


CH14EDT 


1115


454132


RTA00002693F.e.18.2.P.Seq


F


M00043191A:A07


CH19COP


1116


450607


RTA00002691F.d.12.3.P.Seq


F 


M00043372C:G05 


CH17COHLV 


1117 


451718 


RTA00002692F.e.24.2.P.Seq 


F 


M00043044B:A12


CH18CON


1118


453907 


RTA00002693F.b.08.2.P.Seq 


F 


M0004308"3:G07 


CH19COP 


1119


447669


RTA00002689F.a.15.3.P.Seq


F 


M00042538B:E06


CH15CON 


1120 


404044 


RTA00002687F.p.11.1.P.Seq


F


M00040351D:A11


CH14EDT 


1121 


449617 


RTA00002690F.e.16.2.P.Seq


F


M00042849D:F11


CH16COP 


1122


452723


RTA00002692F.e.18.2.P.Seq


F


M00043036C:E05


CH18CON


1123


270014


RTA00002685F.i.15.2.P.Seq


F


M00039536C:H11


CH12EDT 


1124 


401198


RTA00002685F.i.14.2.P.Seq


F


M00039536C:C10


CH12EDT 


1125


452414


RTA00002692F.e.12.1.P.Seq


F


M00043032C:A10


CH18CON


1126


453019


RTA00002692F.d.18.2.P.Seq


F


M00043015A:H10


CH18CON


1127


403642


RTA00002687F.c.24.1.P.Seq


F 


M00039945C:F09 


CH14EDT 


1128


401437


RTA00002685F.c.18.2.P.Seq


F 


M00039377D:E12 


CH12EDT





1129


452414


RTA00002692F.e.12.2.P.Seq


F 


M00043032C:A10 


CH18CON 


1130 


404122 


RTA00002687F.n.10.1.P.Seq


F 


M00040334D:B02 


CH14EDT 


1131 


400567 


RTA00002685F.a.14.1.P.Seq


F 


M00039361B:E01 


CH12EDT 


1132


401437


RTA00002685F.c.18.1.P.Seq


F


M00039377D:E12


CH12EDT


1133 


404642 


RTA00002687F.f.02.1.P.Seq


F


M00040201C:G11


CH14EDT 


1134 


376007 


RTA00002676F.f.22.2.P.Seq


F


M00039293B:C11


CH09LNL 


1135 


402835 


RTA00002686F.b.24.1.P.Seq


F


M00040131D:G08


CH13EDT


1136 


403774 


RTA00002687F.d.08.1.P.Seq


F


M00039947C:G03


CH14EDT 


1137 


45505 


RTA00002712F.d.04.1.P.Seq


F


M00023377B:F01


CH04MAL 


1138 


452071 


RTA00002692F.c.05.2.P.Seq 


F 


M00042979B:E02 


CH18CON


1139 


449832 


RTA00002691F.e.13.1.P.Seq


F 


M00043393A:B08 


CH17COHLV 


1140 


379004 


RTA00002683F.n.09.2.P.Seq


F 


M00040093B:C02 


CH09LNL 


1141 


455211 


RTA00002694F.b.07.1.P.Seq


F 


M00043430B:C02 


CH20COHLV 


1142 


379021 


RTA00002683F.n.13.2.P.Seq


F 


M00040093D:D03 


CH09LNL 


1143 


376279 


RTA00002680F.d.10.2.P.Seq


F 


M00039785D:G05 


CH09LNL 


1144


374373


RTA00002681F.n.21.1.P.Seq


F 


M00039903A:H07 


CH09LNL 


1145 


97668 


RTA00002686F.d.19.1.P.Seq


F 


M00040145D:D03 


CH13EDT 


1146


400407


RTA00002685F.a.05.2.P.Seq


F 


M00039184A:D03 


CH12EDT 


1147 


402904


RTA00002686F.n.15.1.P.Seq


F


M00040274A:H11


CH13EDT 


1148 


403912 


RTA00002687F.j.19.1.P.Seq


F 


M00040317A:H03 


CH14EDT 


1149 


400511 


RTA00002685F.b.23.2.P.Seq


F 


M00039372C:D12 


CH12EDT 


1150 


402746 


RTA00002686F.a.14.1.P.Seq


F 


M00039740B:F10 


CH13EDT 


1151


403849


RTA00002687F.n.09.2.P.Seq


F


M00040333D:G05


CH14EDT


1152


401471


RTA00002685F.o.10.1.P.Seq


F


M00039629B:F01


CH12EDT 


1153 


404362 


RTA00002687F.o.06.2.P.Seq


F 


M00040342B:D12 


CH14EDT 


1154 


373641 


RTA00002677F.i.09.2.P.Seq


F 


M00039403A:G12 


CH09LNL 


1155 


401952 


RTA00002686F.j.10.1.P.Seq


F 


M00040231B:C03 


CH13EDT 


1156


400685


RTA00002685F.m.09.2.P.Seq


F 


M00039597D:F04


CH12EDT 


1157 


402689 


RTA00002686F.n.05.1.P.Seq


F 


M00040271B:E12 


CH13EDT 


1158


380462


RTA00002670F.o.01.2.P.Seq


F


M00033570B:E06


CH09LNL 


1159 


400078 


RTA00002685F.m.15.2.P.Seq


F


M00039600A:A11


CH12EDT 


1160 


373748 


RTA00002671F.l.06.3.P.Seq


F 


M00038325D:F12 


CH09LNL 


1161


401392 


RTA00002685F.f.08.2.P.Seq 


F 


M00039505C:E03


CH12EDT 


1162 


20548 


RTA00002710F.h.15.1.P.Seq


F 


M0002224"~A:E02 


CH03MAH 


1163 


376279 


RTA00002680F.d.10.1.P.Seq


F


M00039785D:G05


CH09LNL 


1164


374428


RTA00002672F.a.20.1.P.Seq


F 


M00038633B:G02 


CH09LNL 


1165


374428


RTA00002672F.a.20.2.P.Seq


F


M00038633B:G02


CH09LNL 


1166


372914


RTA00002679F.j.21.1.P.Seq


F 


M00039696A:E05 


CH09LNL 


1167 


378320 


RTA00002681F.l.14.2.P.Seq


F 


M00039894C:H07 


CH09LNL 


1168


235422


RTA00002665F.h.19.1.P.Seq


F


M00028768C:D05


CH08LNH


1169 


402473 


RTA00002686F.p.11.1.P.Seq


F 


M000402S"C:B09 


CH13EDT 


1170 


374828 


RTA00002674F.m.10.1.P.Seq


F


M00039170A:B10


CH09LNL 


1171


403912


RTA00002687F.j.19.2.P.Seq


F


M00040317A:H03


CH14EDT 


1172


401471


RTA00002685F.o.10.2.P.Seq


F


M00039629B:F01


CH12EDT 


1173


404362


RTA00002687F.o.06.1.P.Seq


F 


M00040342B:D12 


CH14EDT 


1174


403849


RTA00002687F.n.09.1.P.Seq


F


M00040333D:G05


CH14EDT 


1175


395617


RTA00002687F.b.15.1.P.Seq


F 


M0003976"B:A04 


CH14EDT





1176


401709


RTA00002685F.o.01.2.P.Seq


F 


M00039624A:H09 


CH12EDT 


1177 


404464 


RTA00002687F.o.22.1.P.Seq


F 


M00040347D:F09 


CH14EDT 


1178


447795


RTA00002689F.e.06.3.P.Seq


F


M00042895C:G01


CH15CON


1179


18139


RTA00002708F.f.10.1.P.Seq


F


M00004139B:B10


CH01COH 


1180


403898


RTA00002687F.a.05.1.P.Seq


F 


M00039746C:H06 


CH14EDT 


1181 


453512 


RTA00002693F.a.21.2.P.Seq


F


M00043078D:D04


CH19COP


1182 


404172 


RTA00002687F.d.17.1.P.Seq


F


M00039951B:B12


CH14EDT 


1183


400973


RTA00002685F.c.06.2.P.Seq


F


M00039374C:H12


CH12EDT


1184 


450198 


RTA00002691F.e.23.2.P.Seq


F


M00043405A:D11


CH17COHLV 


1185


451502


RTA00002691F.f.03.2.P.Seq


F 


M00043406B:G12 


CH17COHLV 


1186


454414


RTA00002693F.f.18.2.P.Seq


F 


M00043220B:C04 


CH19COP 


1187


453752 


RTA00002693F.b.02.2.P.Seq 


F 


M00043081D:F05


CH19COP 


1188 


403700 


RTA00002687F.g.03.1.P.Seq


F 


M00040207B:D08 


CH14EDT 


1189


403371


RTA00002687F.h.19.1.P.Seq


F 


M00040294D:D12 


CH14EDT 


1190 


14583 


RTA00002687F.f.08.1.P.Seq


F 


M00040203B:A05 


CH14EDT 


1191


404161


RTA00002687F.e.20.1.P.Seq


F 


M00039958C:B09 


CH14EDT 


1192 


403274 


RTA00002687F.b.10.1.P.Seq


F 


M00039766A:G07 


CH14EDT 


1193


373465


RTA00002671F.o.09.1.P.Seq


F 


M00038615A:H12 


CH09LNL 


1194


402582


RTA00002686F.m.08.1.P.Seq


F 


M00040265D:C08 


CH13EDT 


1195


402241


RTA00002686F.l.16.1.P.Seq


F


M00040261C:F01


CH13EDT 


1196


380451


RTA00002670F.p.12.1.P.Seq


F 


M0003353ID:D08 


CH09LNL 


1197


455938


RTA00002694F.d.24.1.P.Seq


F 


M00043528C:A02 


CH20COHLV 


1198


374297


RTA00002672F.i.02.2.P.Seq


F


M00039013D:F02


CH09LNL 


1199 


402624 


RTA00002686F.p.13.1.P.Seq


F


M00040287D:D07


CH13EDT 


1200 


402322 


RTA00002686F.j.16.1.P.Seq


F 


M00040233A:H02 


CH13EDT 


1201 


449504 


RTA00002690F.c.11.2.P.Seq


F 


M00042769C:E09


CH16COP


1202 


226704 


RTA00002664F.a.11.1.P.Seq


F 


M00023352D:H05 


CH04MAL 


1203 


271092 


RTA00002690F.b.23.2.P.Seq 


F 


M00042756D:A10


CH16COP 


1204 


400864 


RTA00002685F.g.17.2.P.Seq


F 


M00039517B:G12 


CH12EDT 


1205 


235855 


RTA00002667F.o.06.1.P.Seq


F 


M00032876C:D06 


CH08LNH 


1206 


402789 


RTA00002686F.g.16.1.P.Seq


F 


M00040133A:F07 


CH13EDT 


1207 


19826 


RTA00002710F.k.05.1.P.Seq


F 


M00022467C:B12


CH03MAH 


1208 


380157 


RTA00002682F.h.19.1.P.Seq


F 


M00039984D:G12 


CH09LNL 


1209 


401187


RTA00002685F.e.15.2.P.Seq


F 


M00039500C:C04 


CH12EDT


1210 


427346 


RTA00002665F.b.01.3.P.Seq


F 


M00028066C:D07 


CH08LNH


1211


402 S66 


RTA00002686F.c.15.1.P.Seq


F


M00040138B:H03


CH13EDT 


1212 


376712 


RTA00002677F.c.13.2.P.Seq


F 


M00039343B:F12 


CH09LNL 


1213 


401655 


RTA00002685F.c.22.1.P.Seq


F 


MOO039378D:HO ' 


CH12EDT 


1214 


400147


RTA00002685F.g.10.1.P.Seq


F 


M00039515A:A06 


CH12EDT


1215 


400864 


RTA00002685F.g.17.1.P.Seq


F


M00039517B:G12


CH12EDT 


1216 


451600 


RTA00002691F.b.19.3.P.Seq


F 


M00043328D:H02 


CH17COHLV 


1217 


400147 


RTA00002685F.g.10.2.P.Seq


F


M00039515A:A06


CH12EDT 


1218 


401655 


RTA00002685F.c.22.2.P.Seq


F 


M0003937SD:H0" 


CH12EDT 


1219 


449307 


RTA00002690F.a.10.3.P.Seq


F


M00042431D:C10


CH16COP


1220 


403121 


RTA00002688F.a.01.2.P.Seq


F 


M00040366A:B01 


CH14EDT 


1221


451718


RTA00002692F.e.24.1.P.Seq


F


M00043044B:A12


CH18CON


1222 


294345 


RTA00002685F.g.14.1.P.Seq


F


M00039515D:C11


CH12EDT


1223


186541


RTA00002712F.p.23.2.P.Seq


F 


M00027181D:A05 


CH04MAL 


1224 


403898 


RTA00002687F.a.05.2.P.Seq 


F 


M00039746C:H06 


CH14EDT 


1225 


403541 


RTA00002687F.p.20.1.P.Seq


F 


M00040364A:E05 


CH14EDT 


1226 


450773 


RTA00002691F.d.24.3.P.Seq


F


M00043383D:A02


CH17COHLV


1227 


376236 


RTA00002685F.l.24.2.P.Seq


F


M00039595C:E05


CH12EDT 


1228 


422357 


RTA00002688F.c.21.1.P.Seq


F 


M00040385C:D02 


CH14EDT 


1229 


404532 


RTA00002687F.p.10.2.P.Seq


F


M00040351B:F02


CH14EDT


1230 


403693 


RTA00002687F.j.23.1.P.Seq


F 


M00040317D:F02 


CH14EDT 


1231 


403693 


RTA00002687F.j.23.2.P.Seq


F


M00040317D:F02


CH14EDT


1232 


401515


RTA00002685F.o.02.2.P.Seq 


F 


M00039624B:F12


CH12EDT 


1233 


404532 


RTA00002687F.p.10.1.P.Seq


F


M00040351B:F02


CH14EDT 


1234 


452077 


RTA00002692F.d.01.2.P.Seq


F


M00043002A:E05


CH18CON


1235 


18003 


RTA00002711F.b.04.1.P.Seq


F


M00022821C:C09


CH03MAH 


1236 


377014 


RTA00002682F.f.13.1.P.Seq


F 


M00039973D:C08 


CH09LNL 


1237


404232 


RTA00002687F.n.12.2.P.Seq


F 


M00040334D:C07 


CH14EDT 


1238 


404232 


RTA00002687F.n.12.1.P.Seq


F 


M00040334D:C07 


CH14EDT 


1239 


406263 


RTA00002685F.d.14.1.P.Seq


F 


M00039493A:C04 


CH12EDT 


1240 


452077 


RTA00002692F.c.24.2.P.Seq


F


M00043002A:E05


CH18CON


1241 


454349 


RTA00002693F.c.09.2.P.Seq 


F 


M00043133B:C11


CH19COP


1242 


447671


RTA00002689F.C P. 1 .P.Seq 


F 


M0004290aB:E07 


CH15CON


1243 


447603 


RTA00002693F.b.14.2.P.Seq


F 


M00043095A:F09 


CH19COP 


1244 


456764 


RTA00002694F.c.14.1.P.Seq


F 


M00043465B:H02 


CH20COHLV 


1245 


401827 


RTA00002686F.l.19.1.P.Seq


F 


M00040262B:B06 


CH13EDT 


1246 


404520 


RTA00002687F.f.05.1.P.Seq


F


M00040202A:F05


CH14EDT 


1247 


449798 


RTA00002691F.d.02.3.P.Seq


F 


M00043366A:A02 


CH17COHLV 


1248 


450993 


RTA00002691F.c.12.3.P.Seq


F


M00043350D:B11


CH17COHLV


1249 


377471 


RTA00002691F.c.02.3.P.Seq


F


M00043339A:F11


CH17COHLV 


1250 


400404 


RTA00002686F.a.17.1.P.Seq


F 


M00039752B:G08 


CH13EDT 


1251 


19106 


RTA00002691F.e.08.2.P.Seq


F


M00043389C:E03


CH17COHLV


1252 


404024 


RTA00002687F.e.18.1.P.Seq


F


M00039958A:A08


CH14EDT


1253 


446404 


RTA00002689F.b.14.1.P.Seq


F


M00042566C:C05


CH15CON


1254 


392921 


RTA00002677F.k.12.2.P.Seq


F


M00039411C:E07


CH09LNL 


1255 


376850 


RTA00002678F.e.10.2.P.Seq


F 


M00039458B:HM 


CH09LNL


1256 


453011


RTA00002692F.f.10.2.P.Seq


F


M00043066B:H11


CH18CON


1257 


234811


RTA00002691F.a.03.3.P.Seq


F


M00042352D:C01


CH17COHLV 


1258 


402708 


RTA00002686F.m.11.1.P.Seq


F 


M0004026"A:E06 


CH13EDT 


1259 


451013 


RTA00002691F.f.08.2.P.Seq


F 


M00043409B:B03


CH17COHLV 


1260 


453011


RTA00002692F.f.10.1.P.Seq


F


M00043066B:H11


CH18CON


1261 


380462 


RTA00002670F.n.24.2.P.Seq


F


M00033570B:E06


CH09LNL 


1262 


379602 


RTA00002681F.c.21.2.P.Seq


F


M00039855C:F01


CH09LNL 


1263 


403896 


RTA00002687F.a.04.1.P.Seq


F


M00039746C:H05


CH14EDT


1264 


403397 


RTA00002687F.h.02.1.P.Seq


F


M00040219B:D02


CH14EDT 


1265 


271723 


RTA00002686F.b.05.1.P.Seq


F


M00039755A:B08


CH13EDT 


1266 


451379 


RTA00002691F.b.12.2.P.Seq


F


M00043312C:E08


CH17COHLV


1267 


456624 


RTA00002694F.e.02.1.P.Seq


F


M00043616B:F02


CH20COHLV 


1268


375483


RTA00002686F.n.14.1.P.Seq


F


M00040274A:D07


CH13EDT 


1269 


402229 


RTA00002686F.i.09.1.P.Seq


F


M00040221A:G11


CH13EDT 




1270


377039 


RTA00002686F.0. 1 1 .P.Seq 


F 


M00040280C:H05


CH13EDT 


1271 


18041 


RTA00002710F.h.21.1.P.Seq


F


M00022262D:G03


CH03MAH 


1272 


401381 


RTA00002685F.o.08.1.P.Seq


F 


M00039626D:F04 


CH12EDT 


1273 


428491 


RTA00002666F.c.05.1.P.Seq


F


M00032535D:H01


CH08LNH 


1274 


54656 


RTA00002661F.i.22.2.P.Seq


F 


M00004372B:F07


CH01COH 


1275 


379183 


RTA00002679F.i.17.1.P.Seq


F


M00039688C:G06


CH09LNL 


1276 


25594 


RTA00002711F.f.07.1.P.Seq


F 


M00022968B:E02 


CH03MAH 


1277 


403355 


RTA00002687F.d.11.1.P.Seq


F


M00039948D:D11


CH14EDT 


1278


16789


RTA00002709F.b.09.2.P.Seq


F 


M00005382B:F08 


CH02COH 


1279 


23292 


RTA00002708F.c.02.1.P.Seq


F 


M00003750D:E06 


CH01COH 


1280 


373982 


RTA00002673F.b.24.2.P.Seq 


F 


M00039058A:A04


CH09LNL 


1281


373982


RTA00002673F.c.01.2.P.Seq


F


M00039058A:A04


CH09LNL 


1282 


449911


RTA00002691F.e.02.2.P.Seq


F 


M000433S^B:302 


CH17COHLV


1283 


450633 


RTA00002691F.f.02.2.P.Seq


F


M00043405C:G12


CH17COHLV


1284 


23939 


RTA00002713F.j.14.1.P.Seq


F 


M00027486A:F06 


CH04MAL 


1285 


450633 


RTA00002691F.f.02.1.P.Seq


F


M00043405C:G12


CH17COHLV


1286 


379122 


RTA00002672F.n.14.1.P.Seq


F 


M0003903^B:F09 


CH09LNL 


1287 


449429 


RTA00002690F.a.16.3.P.Seq


F 


M0004243-A:D04 


CH16COP


1288 


430578 


RTA00002668F.2. 18. 1 .P.Seq 


F 


M000329S-^C:G05 


CH08LNH 


1289 


425324 


RTA00002687F.b.17.1.P.Seq


F


M00039767C:E12


CH14EDT


1290 


425824 


RTA00002687F.b.17.2.P.Seq


F


M00039767C:E12


CH14EDT 


1291 


401266 


RTA00002685F.i.11.2.P.Seq


F 


M00039535D:D10 


CH12EDT 


1292 


377949 


RTA00002674F.p.04.1.P.Seq


F 


M00039200A:C10 


CH09LNL 


1293 


12926 


RTA00002710F.e.21.1.P.Seq


F


M00022005C:C06


CH03MAH 


1294 


373242 


RTA00002679F.c.20.2.P.Seq 


F 


M0003966^D:G07 


CH09LNL 


1295 


401781 


RTA00002686F.e.08.1.P.Seq


F


M00040160B:A10


CH13EDT 


1296 


453101 


RTA00002693F.c.16.2.P.Seq


F 


M00043U3 3:A10 


CH19COP


1297 


377592 


RTA00002677F.l.12.2.P.Seq


F 


M000394';f D:E0I 


CH09LNL 


1298 


404340 


RTA00002687F.b.05.1.P.Seq


F 


M000397o-C:D07 


CH14EDT 


1299 


400968 


RTA00002685F.h.01.2.P.Seq


F 


M0003952: D H03 


CH12EDT 


1300 


400968 


RTA00002685F.g.24.2.P.Seq


F 


M000395:; D:H03 


CH12EDT 


1301 


374417 


RTA00002671F.j.15.3.P.Seq


F


M00038315C:G11


CH09LNL 


1302 


374621 


RTA00002675F.p.02.1.P.Seq


F 


M0003926JD-A12 


CH09LNL 


1303 


19063 


RTA00002708F.i.14.1.P.Seq


F


M00004361A:H02


CH01COH


1304 


135941 


RTA00002713F.g.06.1.P.Seq


F 


M0002735^3:G05 


CH04MAL 


1305 


403355 


RTA00002687F.d.11.2.P.Seq


F


M00039948D:D11


CH14EDT 


1306 


375226 


RTA00002677F.m.08.2.P.Seq


F 


M0003941 "C:A0I 


CH09LNL 


1307 


222658 


RTA00002664F.e.14.2.P.Seq


F


M00027103B:A09


CH04MAL 


1308 


447978 


RTA00002690F.d.11.3.P.Seq


F


M00042800A:A03


CH16COP


1309 


431346 


RTA00002669F.g.24.1.P.Seq


F 


M00033218A:C04


CH08LNH


1310 


455579 


RTA00002694F.a.10.1.P.Seq


F 


M0004:>^c3:F06 


CH20COHLV 


1311 


13406 


RTA00002709F.L 14. 1 .P.Seq 


F 


M0000712-DH10 


CH02COH 


1312 


378364 


RTA00002674F.o.17.1.P.Seq


F 


M0003^1^cD A07 


CH09LNL 


1313 


373788 


RTA00002671F.c.16.2.P.Seq


F 


M0003S25^A:G0S 


CH09LNL


1314 


403548 


RTA00002688F.a.10.2.P.Seq


F 


M000403o5D:E09 


CH14EDT 


1315 


22425 


RTA00002709F.c.08.2.P.Seq


F 


M000054OSA:H0'6 


CH02COH 


1316 


452238 


RTA00002692F.c.21.2.P.Seq


F 


M00042°°SA.G04 


CH18CON


1317


446680 


RTA00002689F.c.04.1.P.Seq


F 


M00042693D:E04 


CH15CON 


1318 


142922 


RTA00002712F.g.02.1.P.Seq


F


M00026860B:C05


CH04MAL 


1319 


450196 


RTA00002691F.c.19.3.P.Seq


F


M00043359B:D10


CH17COHLV


1320 


26017 


RTA00002709F.d.04.1.P.Seq


F 


M00005601D:D08 


CH02COH 


1321 


380355 


RTA00002670F.o.06.1.P.Seq


F 


M00033570C:C10 


CH09LNL 


1322 


25232 


RTA00002710F.n.22.1.P.Seq


F 


M00022667D:B02 


CH03MAH 


1323 


378952 


RTA00002683F.h.11.1.P.Seq


F 


M00040070B:B07 


CH09LNL 


1324 


404487 


RTA00002687F.c.13.2.P.Seq


F


M00039943B:F10


CH14EDT 


1325 


48482 


RTA00002712F.p.06.1.P.Seq


F


M00027159D:F03


CH04MAL 


1326 


373705 


RTA00002673F.a.13.1.P.Seq


F 


M00039052C:F07 


CH09LNL 


1327 


373705 


RTA00002673F.a.13.2.P.Seq


F 


M00039052C:F07 


CH09LNL 


1328 


21162


RTA00002709F.c.03.1.P.Seq


F


M00005449B:D01


CH02COH 


1329 


15203 


RTA00002710F.a.21.1.P.Seq


F 


M00007972B:H12 


CH03MAH 


1330 


21162 


RTA00002709F.c.03.2.P.Seq


F 


M00005449B:D01 


CH02COH 


1331 


401013 


RTA00002685F.o.16.2.P.Seq


F


M00039641A:A05


CH12EDT 


1332 


404449 


RTA00002687F.c.04.2.P.Seq


F 


M00039770C:E04 


CH14EDT 


1333 


429672 


RTA00002668F.b.10.1.P.Seq


F 


M00032909A:B06 


CH08LNH 


1334 


48541 


RTA00002712F.i.07.1.P.Seq


F 


M00026922C:B02 


CH04MAL 


1335 


378424 


RTA00002681F.a.03.2.P.Seq


F 


M00039839B:B01 


CH09LNL 


1336 


49540 


RTA00002712F.d.24.1.P.Seq


F 


M00023399C:E10 


CH04MAL 


1337 


379170 


RTA00002672F.i.21.1.P.Seq


F 


M00039016D:G06 


CH09LNL 


1338 


179540 


RTA00002683F.o.20.2.P.Seq


F


M00040100C:E05


CH09LNL 


1339 


451269 


RTA00002691F.f.11.1.P.Seq


F 


M000434! IB: DOS 


CH17COHLV 


1340 


449832 


RTA00002691F.e.13.2.P.Seq


F


M00043393A:B08


CH17COHLV


1341 


380119


RTA00002670F.m.20.2.P.Seq 


F 


M00033560D:G07 


CH09LNL 


1342 


153094 


RTA00002714F.a.12.1.P.Seq


F


M00027743A:C03


CH04MAL 


1343 


448749 


RTA00002690F.d.14.2.P.Seq


F 


M00042S06C:F07 


CH16COP 


1344 


448749 


RTA00002690F.d.14.3.P.Seq


F


M00042S06C:F07


CH16COP 


1345 


454816 


RTA00002693F.b.16.1.P.Seq


F 


M00043096A:G04 


CH19COP 


1346 


374744 


RTA00002670F.L 16.2. P.Seq 


F 


M0003342~D:F01 


CH09LNL 


1347 


404449 


RTA00002687F.c.04.1.P.Seq


F


M00039770C:E04


CH14EDT 


1348 


58005 


RTA00002661F.h.14.1.P.Seq


F 


M00004222C:E03 


CH01COH 


1349 


451379 


RTA00002691F.b.12.3.P.Seq


F


M00043312C:E08


CH17COHLV


1350 


456323 


RTA00002694F.d.21.1.P.Seq


F


M00043526B:D10


CH20COHLV 


1351 


455957 


RTA00002694F.c.15.1.P.Seq


F 


M00043465C:A03 


CH20COHLV 


1352 
F
374722
F
F
1355
1356
F
1357


378000 


RTA00002681F.j.16.2.P.Seq


F 


M0003988~D:C04 


CH09LNL 


1358 


448356 


RTA00002690F.c.03.3.P.Seq 


F 


M00042760A:C12 


CH16COP 
456629


RTA00002694F.d.04.1.P.Seq


F


M00043491C:F04


CH20COHLV 


1360 


431346 


RTA00002669F.g.24.2.P.Seq


F 


M0003321SA:C04 
377206


RTA00002682F.m.14.1.P.Seq


F 
453036
F


M00042960D:H08
F
230532


RTA00002664F.c.11.2.P.Seq


F


M00026901A:G07


CH04MAL 


1365 


30755 


RTA00002663F.e.03.1.P.Seq


F 


M000221 38 A: EOS 


CH03MAH 


1366


451438 


RTA00002691F.d.23.3.P.Seq


F 


M00043383C:F12 


CH17COHLV 


1367 


379011


RTA00002681F.n.23.1.P.Seq


F


M00039903C:D01


CH09LNL 


1368 


404048 


RTA00002687F.g.01.1.P.Seq


F 


M00040206A:A07 


CH14EDT 


1369 


404048 


RTA00002687F.g.01.2.P.Seq


F 


M00040206A:A07 


CH14EDT 


1370 


452398 


RTA00002692F.f.17.2.P.Seq


F 


M00043125C:A11 


CH18CON


1371 


403686 


RTA00002687F.d.03.1.P.Seq


F 


M00039946B:F08 


CH14EDT 
403686


RTA00002687F.d.03.2.P.Seq 


F 


M00039946B:F08 


CH14EDT 


1373 


404048 


RTA00002687F.f.24.2.P.Seq 


F 
CH14EDT


1374 


404048 


RTA00002687F.f.24.1.P.Seq


F 
F
CH19COP
374736
F
378912


RTA00002672F.n.01.2.P.Seq 


F 


M00039036C:B05 


CH09LNL 


1396 


134877 


RTA00002662F.d.05.2.P.Seq


F 
CH02COH


1397 


372811 


RTA00002670F.c.12.2.P.Seq


F 


M0003334"C:F02 


CH09LNL 


1398 


373296 


RTA00002672F.e.08.2.P.Seq


F 
CH09LNL


1399 


373296 


RTA00002672F.e.08.1.P.Seq


F 


M0003899JA:A10 


CH09LNL 
452903


RTA00002692F.f.08.2.P.Seq


F 
450067
F
376107
F
450580


RTA00002691F.c.20.3.P.Seq


F


M00043359C:G01


CH17COHLV


1413 


379942 


RTA00002679F.l.21.1.P.Seq


F 


M00039707A:D02 


CH09LNL 


1414 


375589 


RTA00002680F.f.06.1.P.Seq


F 


M00039794A:E04 


CH09LNL 


1415 


375789 


RTA00002674F.a.16.1.P.Seq


F 


M00039120C:H03 


CH09LNL 


1416 


456227 


RTA00002694F.c.16.1.P.Seq


F


M00043465C:C09


CH20COHLV 


1417 


455852 


RTA00002694F.a.02.1.P.Seq


F 


M00042592A:H10 


CH20COHLV 


1418 
RTA00002710F.m.05.1.P.Seq


F 


M00022579C:C11 


CH03MAH 


1419 


376524 


RTA00002678F.h.23.2.P.Seq 


F 


M00039477A:B03 


CH09LNL 


1420 


449562 


RTA00002690F.b.13.2.P.Seq


F


M00042515C:F08


CH16COP


1421 


449562 


RTA00002690F.b.13.3.P.Seq


F
CH16COP


1422 


286001 


RTA00002690F.b.08.2.P.Seq


F
CH16COP


1423 
RTA00002690F.b.08.3.P.Seq


F


M00042511A:H04


CH16COP


1424 


380322 


RTA00002683F.p.21.1.P.Seq


F 


M00040106B:309 


CH09LNL 


1425 


401603 


RTA00002685F.f.23.2.P.Seq


F 


M00039510C:G02


CH12EDT 


1426 


376541 


RTA00002678F.d.13.2.P.Seq


F 


M00039456A:C08 


CH09LNL 


1427 


449123 


RTA00002690F.a.13.3.P.Seq


F


M00042435A:A11


CH16COP 


1428 


418358 


RTA00002686F.m.07.1.P.Seq


F 


M00040265D:B07 


CH13EDT 


1429 


380263 


RTA00002689F.a.22.1.P.Seq


F


M00042543C:G04


CH15CON


1430 


455748 


RTA00002694F.b.06.1.P.Seq


F 


M00043428D:G08 


CH20COHLV 


1431 


451679 


RTA00002693F.a.04.2.P.Seq 


F 


M00042612D:F06


CH19COP 


1432 


396332 


RTA00002686F.k.14.1.P.Seq


F 


M00040252C:C06 


CH13EDT 


1433 


377578 


RTA00002683F.b.11.2.P.Seq


F


M00040037A:E11


CH09LNL 


1434 


20061 


RTA00002710F.m.14.1.P.Seq


F 


M00022597D:A06 


CH03MAH 


1435 


402494 


RTA00002686F.h.16.1.P.Seq


F


M00040191A:B09


CH13EDT 


1436 


372798 


RTA00002670F.c.18.2.P.Seq


F 


M00033349D:F05 


CH09LNL 


1437 


236295 


RTA00002679F.a.19.2.P.Seq


F 


M00039655B:H09 


CH09LNL 


1438 


451570 


RTA00002691F.c.03.3.P.Seq


F 


M00043340B:H08 


CH17COHLV 


1439 


35847 


RTA00002708F.h.03.1.P.Seq


F
CH01COH


1440 


455706 


RTA00002694F.b.10.1.P.Seq


F 


M00043433B:G09 
1441


346310 


RTA00002684F.d.18.1.P.Seq
M00040122D:A02


CH09LNL 


1442 


189561 


RTA00002676F.j.09.3.P.Seq


F 


M0003930SB:G08 
1443


403200 


RTA00002687F.j.24.1.P.Seq


F 


M000403 13A:B02 


CH14EDT 


1444 


401413 


RTA00002685F.i.03.2.P.Seq 


F 


M00039530B:E02 


CH12EDT


1445 


448680 


RTA00002690F.b.02.3.P.Seq


F


M00042440B:E09


CH16COP


1446 


117060


RTA00002679F.h.24.1.P.Seq


F 


M00039686C:C05 


CH09LNL 


1447 


403200 


RTA00002687F.j.24.2.P.Seq 


F 


M000403 18A:B02 


CH14EDT 


1448 


448589 


RTA00002690F.a.07.3.P.Seq


F


M00042349D:D07


CH16COP


1449 
RTA00002674F.o.02.1.P.Seq


F 
1450
F
373111
F
12350
CH14EDT
F
1457
F
LHU-LOH


24


10198


R i AUUUU-y-jr.).U9- 1 .r.Seq 


F


M UOO 9 _ v4C : b 09 


PUAAI N'T 


25




R I AUU(J02oV4r .p. 12. l.r.^eq 


F


X fAAAA lACCH.PiAs 


CH01COH


26 


12227


R I AU(JU(J2909r.e. IS. 1 .P.Seq 


F 


M0002260 1 B : G0b 


TT A X f » IT 


27


11047


RTA00002893F.o.06.1.P.Seq


F


M0000j960D:C l_ 


CH01COH


28

1870 


RTA000029 lOF.m. OS. I.P.Seq 


F 


M000_j020C:H0j 


at rA'' \ f \ T T 


29 


20065 


RT.-\0000_90SF.m. 09. I.P.Seq 


F 


M00022-91.\:A0S 


CH03MAH


30 


19454 


RT AUOO02900F.m. 2 j». I.P.Seq 


F 


M000O53"9A:D10 


CH02COH


31


48048 


RTA000029^2F.m.l j. I.P.Seq 


F


M00039124D:H01


CH09LNL


32 


19799


RTA0000_90bF.n. 19. I.P.Seq 


F 


M0002 ^-1-19 D : FOb 


CH03MAH


33 


ISj:>62 


R I AUUUU-y 1 lr.m.U / . 1 .r.ieq 


F


ivi UUU _ / U v J 1 A: HO- 


CH04MAL


34


24214


K i AUUuU-bv Ir.K. 19. t. r.Seq 


F


NlUUUUj o4L):r0 


pun i lj 


35


D 172 


K I AUUUU29Uor .p. 22. 1 .r.Seq 


F


M0UU222 «2d:UU9 


L riUJ iVL^Arl 


36 


50495 


K I AUUU02^9c>r.c. 16. I.P.Seq 


F 


M00004j_ IC:C 1 1 


LHU LL Uri


37


43287 


RTA00002906F.k.16.1.P.Seq


F 


M000224 0D:B0_ 


CH03MAH


38 


15324 


RTAOOOO290^F.p.20. 1 .P.Seq 


F 


M0002 lc l ) C:B0 


CH03MAH


39 


22157 


RTA00002SS5F.2.0. .I.P.Seq 


F 


M00001461D:B10


CH01COH


40 


15249 


RTA00C029 15F.1.0S. I.P.Seq 


F 


M000324S9B:G12 


CH08LNH 


41 


2764 


RTA000029-DF.C. 1 1. 1 .P.Seq 


F 


M000j9j>29B:E01 


CH09LNL


42 


23838 


RTA00002bb9F.b. 14. i.P.Seq 


F 


M 0000 1? L^B:D;0 


CH01COH


43 


11074


RTA00002899F.g.22.1.P.Seq


F 


M00004c0jC:C 10 


CH01COH


44 


18367 


RTA00OU-9--F.D.09. I.P.Seq 


F 


M000-^o .9D:C l- 


CH09LNL


45

_ l / U J 


Ix I a UU Uv - V v. r . m .UO . i.i. jlC| 




t> 1 UUUU I'. VD.U U 




46 


21470 


RTA00002895F.c.14.1.P.Seq


F


M00004067B:D03


CH01COH


47 


15492 


RT A00O0 290" F. p. 0b. 1 . P.Seq 


F 


M000222S2B:C09 


CH03MAH


48 


4022 


RTA00002897F.i.22.1.P.Seq


F 


M000042o9B:B0-i 


CH01COH


49 


21579 


R T AO0OO ZS9iF.e.03. I.P.Seq 


F 


M000OlcS6B:H01 


CH01COH


50 


1S62S3 


RTA00002913F.c.06.1.P.Seq


F 


M0002"S.:.1B:D0" 


CH04MAL
51


5410 


RTA00002921F.i.08.1.P.Seq


F 


M00033445D:G03 


CH09LNL 


52 


22420 


RTA00002901F.e.19.1.P.Seq


F 


M00005474C:H09 


CH02COH 


53 


140553 


RTA00002916F.n.02.1.P.Seq


F 


M0003263SB:F02 


CH08LNH 


54 


23849 


RTA00002887F.a.22.1.P.Seq


F 


M000013S6B:F1 1 


CH01COH 


55 


21945 


RTA00002895F.l.22.1.P.Seq


F 


M00004103C:E10 


CH01COH 


56 


7867 


RTA00002901F.p.08.1.P.Seq


F 


M00005710B:H03 


CH02COH 


57 


14533 


RTA00002896F.l.01.1.P.Seq


F


M00004179C:B06


CH01COH


58 


5790 


RTA00002919F.g.17.1.P.Seq


F 


M00033080C:A07 


CH08LNH 


59 


186153 


RTA00002911F.i.24.1.P.Seq


F 


M00027017A:B09 


CH04MAL 


60 


10561 


RTA00002899F.h.08.1.P.Seq


F


M00004606D:H09


CH01COH


61 


24572 


RTA00002893F.l.08.1.P.Seq


F


M00003926A:F11


CH01COH


62 


13138 


RTA00002888F.m.03.1.P.Seq


F


M00001488C:A03


CH01COH


63 


6701


RTA00002922F.2. 18. 1 .P.Seq 


F 


M00039055C:A01 


CH09LNL 


64 


12751 


RTA00002904F.c.10.1.P.Seq


F 


M00007202B:F01 


CH02COH 


65 


3583 


RTA00002916F.n.21.1.P.Seq


F 


M00032644C:B05 


CH08LNH 


66 


12673 


RTA00002901F.d.24.1.P.Seq


F 


M00005463A:G02 


CH02COH 


67 


15243 


RTA00002901F.l.21.1.P.Seq


F 


M00005623B:G01 


CH02COH 


68 


21022 


RTA00002922F.k.24.1.P.Seq


F 


M00039111A:C12 


CH09LNL 


69 


36596 


RTA00002919F.g.24.1.P.Seq


F 


M00033081D:D11 


CH08LNH 


70 


4932 


RTA00002890F.c.14.1.P.Seq


F


M00001596A:D02


CH01COH


71 


42413 


RTA00002900F.o.14.1.P.Seq


F 


M00005401D:F09 


CH02COH 


72 


1090 


RTA00002918F.g.20.1.P.Seq


F


M00032892C:C12


CH08LNH 


73 


44737 


RTA00002901F.a.20.1.P.Seq


F 


M00005434A:C03 


CH02COH 


74 


4183 


RTA00002918F.n.23.1.P.Seq


F 


M0003298SB:G01 


CH08LNH 


75 


41882 


RTA00002902F.d.12.1.P.Seq


F


M00006586D:D04


CH02COH 


76 


500 


RTA00002925F.o.18.1.P.Seq


F 


M00040034A:E06 


CH09LNL 


77 


5435 


RTA00002921F.f.20.1.P.Seq


F 


M00033420B:E08 


CH09LNL 


78 


15829 


RTA00002900F.j.01.1.P.Seq


F 


M00005314A:G10 


CH02COH 


79 


154083 


RTA00002907F.j.06.1.P.Seq


F 


M00022096D:A03 


CH03MAH 


80 


24381 


RTA00002910F.i.16.1.P.Seq


F 


M00022953B:D06 


CH03MAH 


81 


107940 


RTA00002930F.f.07.1.P.Seq


F


M00055735A:H08


CH15CON 


82 


24761 


RTA00002902F.l.21.1.P.Seq


F 


M00006756C:A02 


CH02COH 


83 


10734 


RTA00002924F.e.02.1.P.Seq


F 


M00039457D:C02 


CH09LNL 


84 


40540 


RTA00002897F.p.23.1.P.Seq


F


M00004303C:C05


CH01COH


85 


23692 


RTA00002930F.L07.1. P.Seq 


F 


M00056057C:F06 


CH15CON 


86 


7896 


RTA00002906F.j.08.1.P.Seq


F 


M00021998B:D09 


CH03MAH 


87 


243S7 


RTA00002896F.e.09.1.P.Seq


F


M00004151B:A07


CH01COH


88 


2420 


RTA00002889F.h.02.1.P.Seq


F


M00001546B:C11


CH01COH


89 


10431 


RTA00002887F.p.07.1.P.Seq


F


M00001429B:G05


CH01COH


90 


14665 


RTA00002908F.g.07.1.P.Seq


F 


M00022425A:C09 


CH03MAH 


91 


10302 


RTA00002906F.o.03.1.P.Seq


F 


M00022081A:B07 


CH03MAH 


92 


28436 


RTA00002908F.f.08.1.P.Seq


F 


M00022415C:D12 


CH03MAH 


93 


17S29 


RTA00002889F.g.11.1.P.Seq


F


M00001544B:E06


CH01COH


94 


10390 


RTA00002906F.e.13.1.P.Seq


F


M00021923A:B12


CH03MAH 


95 


11619 


RTA00002913F.c.07.1.P.Seq


F 


M00027S06C:H05 


CH04MAL 


96 


6890 


RTA00002918F.m.19.1.P.Seq


F


M00032979D:H07


CH08LNH


97 


10110 


RTA00002S97F.S.13.1. P.Seq 


F 


M00004245C:G10 


CH01COH


98 


21511 


RTA00002892F.h.24.2.P.Seq


F 


M00003S21C:E12 


CH01COH


99 


9287 


RTA00002899F.h.14.1.P.Seq


F


M00004608A:C10


CH01COH


100 


16575 


RTA00002895F.n.02.1.P.Seq


F


M00004110D:F09


CH01COH
101


16857 


RTA00002899F.h.17.1.P.Seq


F


M00004609A:E09


CH01COH


102 


33329 


RTA00002891F.p.14.1.P.Seq


F


M00003787D:A10


CH01COH


103 


40652 


RTA00002896F.f.21.1.P.Seq


F


M00004153B:E03


CH01COH


104 


8070 


RTA00002925F.c.09.1.P.Seq


F 


M0003982SB:H06 


CH09LNL 


105 


15880 


RTA00002887F.g.11.1.P.Seq


F 


M00001397C:H08 


CH01COH


106 


87418 


RTA00002901F.g.10.1.P.Seq


F 


M00005500A:D04 


CH02COH 


107 


9961 


RTA00002903F.l.21.1.P.Seq


F 


M00007046D:C09 


CH02COH 


108 


9966 


RTA00002911F.o.20.1.P.Seq


F 


M00027168B:H08 


CH04MAL 


109 


17513 


RTA00002906F.d.03.1.P.Seq


F


M00021896D:A05


CH03MAH 


110 


24835 


RTA00002924F.f.18.1.P.Seq


F 


M00039554D:B09 


CH09LNL 


111 


15200 


RTA00002891F.j.08.1.P.Seq


F


M00003758B:F06


CH01COH


112 


124098 


RTA00002905F.e.23.1.P.Seq


F


M00008020D:D05


CH03MAH 


113 


3786 


RTA00002901F.e.05.1.P.Seq


F 


M00005466C:B01 


CH02COH 


114 


154121 


RTA00002906F.p.02.1.P.Seq


F 


M0002208SB:F10 


CH03MAH 


115 


5746 


RTA00002918F.l.09.1.P.Seq


F


M00032945D:B07


CH08LNH


116 


33700 


RTA00002901F.e.09.1.P.Seq


F 


M00005468A:C04 


CH02COH 


117 


5660 


RTA00002890F.b.11.1.P.Seq


F


M00001591B:H05


CH01COH


118 


22732 


RTA00002924F.o.23.1.P.Seq


F 


M000397S5C:H12 


CH09LNL 


119 


14720 


RTA00002892F.j.05.1.P.Seq


F


M00003825A:H10


CH01COH


120 


13658 


RTA00002896F.c.04.1.P.Seq


F


M00004143B:B04


CH01COH


121 


23150 


RTA00002887F.a.01.1.P.Seq


F


M00001384A:A07


CH01COH


122 


11970 


RTA00002903F.k.03.1.P.Seq


F 


M00007002C:A10 


CH02COH 


123 


10686 


RTA00002915F.p.02.2.P.Seq 


F 


M00032519D:F08 


CH08LNH


124 


9588 


RTA00002923F.m.10.1.P.Seq


F 


M00039331B:F09 


CH09LNL 


125 


8500 


RTA00002925F.e.21.1.P.Seq


F 


M00039860D:B02 


CH09LNL 


126 


8615 


RTA00002907F.l.17.1.P.Seq


F


M00022238C:G04


CH03MAH


127 


7524 


RTA00002886F.e.16.1.P.Seq


F


M00001348B:B03


CH01COH


128 


325 


RTA00002912F.g.02.1.P.Seq


F


M00027332B:H09


CH04MAL 


129 


10214 


RTA00002889F.o.01.1.P.Seq


F


M00001570A:B07


CH01COH


130 


23534 


RTA00002889F.c.22.1.P.Seq


F


M00001533D:A01


CH01COH


131 


7473 


RTA00002893F.o.20.1.P.Seq


F


M00003965D:D11


CH01COH


132 


185625 


RTA00002912F.f.10.1.P.Seq


F


M00027314D:E02


CH04MAL


133 


3920 


RTA00002917F.m.07.1.P.Seq


F


M00032773D:F08


CH08LNH


134 


8458 


RTA00002889F.m.02.1.P.Seq


F


M00001562D:B07


CH01COH


135 


20263 


RTA00002906F.n.08.1.P.Seq


F 


M00022073C:C07 


CH03MAH 


136 


186141 


RTA00002912F.f.04.1.P.Seq


F 


M00027311A:H09 


CH04MAL 


137 


4852 


RTA000029 19F.L15. LP.Seq 


F 


M00033150B:E02 


CH08LNH


138 


2146 


RTA00002926F.a.21.1.P.Seq


F 


M00040061C:C08 


CH09LNL 


139 


3522 


RTA00002897F.h.23.1.P.Seq


F


M00004263C:D03


CH01COH


140 


20027 


RTA00002909F.g.21.1.P.Seq


F


M00022627B:H03


CH03MAH


141 


15650 


RTA00002925F.g.04.1.P.Seq


F


M00039874A:B06


CH09LNL 


142 


21031 


RTA00002901F.h.21.1.P.Seq


F 


M00005o20B:H05 


CH02COH 


143 


95610 


RTA00002909F.h.18.1.P.Seq


F


M00022642A:G08


CH03MAH


144 


903 


RTA00002912F.o.03.1.P.Seq


F 


M00027591A:E04 


CH04MAL 


145 


17284 


RTA00002916F.k.18.1.P.Seq


F


M00032620B:F06


CH08LNH


146 


15556 


RTA00002895F.m.01.1.P.Seq


F


M00004104A:A12


CH01COH


147 


11013 


RTA00002S97F.b.OS. LP.Seq 


F 


M00004214D:A05 


CH01COH


148 


15358 


RTA00002903F.n. IS. LP.Seq 


F 


M0000709SA:E10 


CH02COH 


149 


10792 


RTA00002894F.a.10.1.P.Seq


F


M00003974C:E11


CH01COH


150 


25507 


RTA00002901F.n.07.1.P.Seq


F 


M00005643D:A05 


CH02COH 
151


19978 


RTA00002908F.k.21.1.P.Seq


F


M00022472D:B01


CH03MAH 


152 


31133 


RTA00002905F.f.23.1.P.Seq 


F 


M00008045C:A05


CH03MAH 


153 


4503 


RTA00002893F.h.14.1.P.Seq


F 


M00003906A:C02 


CH01COH 


154 


9017 


RTA00002886F.i.03.1.P.Seq


F 


M00001358A:E08 


CH01COH 


155 


6635 


RTA00002895F.a.07.1.P.Seq


F 


M00004057D:G01 


CH01COH 


156 


10220 


RTA00002921F.d.09.1.P.Seq


F 


M00033360C:A03 


CH09LNL 


157 


9831 


RTA00002896F.b.08.1.P.Seq


F 


M00004139B:F01 


CH01COH 


158


17291 


RTA00002883F.c.10.1.P.Seq


F 


M00001442A:F08 


CH01COH 


159 


9802 


RTA00002916F.l.12.1.P.Seq


F


M00032628C:B06


CH08LNH 


160 


133828 


RTA00002908F.b.21.1.P.Seq


F 


M00022374C:E11 


CH03MAH 


161 


25870 


RTA00002909F.m.23.1.P.Seq


F 


M00022702D:E02 


CH03MAH 


162 


27324 


RTA000029 1 2F. L22. 1 .P.Seq 


F 


M00027433B:D12 


CH04MAL 


163 


98159 


RTA00002910F.f.19.1.P.Seq


F 


M00022897B:F06 


CH03MAH 


164 


21264 


RTA00002903F.d.15.1.P.Seq


F 


M00006904D:A02 


CH02COH 


165 


186199 


RTA00002911F.b.13.2.P.Seq


F 


M00023394D:D10 


CH04MAL 


166 


28794 


RTA00002887F.i.14.1.P.Seq


F 


M00001403C:B03 


CH01COH 


167 


23180 


RTA00002895F.j.17.1.P.Seq


F 


M00004093A:C03 


CH01COH 


168 


21022 


RTA00002922F.l.01.1.P.Seq


F 


M00039111A:C12 


CH09LNL 


169 


14370 


RTA00002893F.1. 17. 1 .P.Seq 


F 


M00003911C:A09


CH01COH 


170 


4804 


RTA00002918F.a.19.1.P.Seq


F


M00032826C:D10


CH08LNH 


171 


7066 


RTA00002919F.o.07.1.P.Seq


F 


M00033246A:H12 


CH08LNH 


172 


48227 


RTA00002903F.o.18.1.P.Seq


F 


M00007117A:C11 


CH02COH 


173 


20171 


RTA00002886F.i.15.1.P.Seq


F 


M00001359A:H10 


CH01COH 


174 


10555 


RTA00002894F.p.10.1.P.Seq


F 


M00004055C:B10 


CH01COH 


175 


12523 


RTA00002914F.m.08.1.P.Seq


F 


M0OO2836lB:H0S 


CH08LNH 


176 


23767 


RTA00002S96F.L2 1 . 1 .P.Seq 


F 


M00004171B:B03 


CH01COH 


177 


16849 


RTA00002918F.b.07.1.P.Seq


F 


M00032829D:A05 


CH08LNH 


178 


185866 


RTA00002911F.c.18.2.P.Seq


F 


M0002681SC:E01 


CH04MAL 


179 


29927 


RTA00002899F.b.20.1.P.Seq


F 


M00004443C:F07 


CH01COH 


180 


21975 


RTA00002902F.a.01.1.P.Seq


F 


M00005743D:A12 


CH02COH 


181 


24456 


RTA00002903F.b.20.1.P.Seq


F 


M00006877C:F11 


CH02COH 


182 


6034 


RTA00002901F.a.12.1.P.Seq


F


M00005423A:C11


CH02COH 


183 


11362 


RTA00002887F.h.06.1.P.Seq


F 


M00001399C:A01 


CH01COH 


184


20671 


RTA00002905F.a.22.1.P.Seq


F


M00007947A:B06


CH03MAH 


185 


8059 


RTA00002917F.b.02.1.P.Seq


F 


M00032671B:D06 


CH08LNH 


186 


12037 


RTA00002897F.d.11.1.P.Seq


F


M00004229B:B06


CH01COH


187 


13209 


RTA00002897F.d.20.1.P.Seq


F 


M00004230D:B05 


CH01COH 


188 


23660 


RTA000029 15F.L2 1. 1 .P.Seq 


F 


M00031416D:H05 


CH08LNH 


189 


4747 


RTA00002919F.c.23.1.P.Seq


F


M00033041A:B11


CH08LNH 


190 


24532 


RTA00002919F.m.16.1.P.Seq


F 


M0003321SC:F07 


CH08LNH


191 


8576 


RTA00002890F.h.13.1.P.Seq


F


M00001616D:F03


CH01COH


192 


12056 


RTA00002893F.g.12.1.P.Seq


F


M00003900C:D12


CH01COH


193 


895 


RTA00002921F.b.11.1.P.Seq


F 


M00033303C:F09 


CH09LNL 


194 


7212 


RTA00002897F.j.04.1.P.Seq


F


M00004270A:E09


CH01COH


195 


108296 


RTA00002907F.h.20.1.P.Seq


F 


M00022193C:C09 


CH03MAH 


196 


115713 


RTA00002906F.a.22.1.P.Seq


F 


M00021S52C:H02 


CH03MAH 


197 


7334 


RTA00002910F.l.08.1.P.Seq


F 


M00023004C:A01 


CH03MAH 


198 


1090 


RTA00002918F.g.20.2.P.Seq


F


M00032892C:C12


CH08LNH


199 


7913 


RTA00002886F.j.13.1.P.Seq


F


M00001362A:F09


CH01COH


200 


12139 


RTA00002923F.o.02.1.P.Seq


F


M00039349D:B11


CH09LNL 
201


17148


RT A0000°S95F 1 14 1 P Sea 


F




CH01COH


202 


2314


RT A0000^S9 IF * 10 1 P Sea 


F 


M00001771B:E06


CH01COH


203 


18660 


RTA00002903F.m.03.1.P.Seq


F 


1 M00007048C- AP 


CH02COH


204 


21799 


RTA00002900F.n.09.1.P.Seq


F


M00005385D:F07


CH02COH


205 


16612 


RTA0000 n 893F i 17 1 P Sea 


F 


M00003915C:D10 


CH01COH 


206 


168067 


RTA00002909F.a.17.1.P.Seq


F 


M00022537B:C06 


CH03MAH 


207 


21197


RTA00002891F.c.19.1.P.Seq


F


M00001680A:A01


CH01COH 


208 


45015 


RTA00002905F.o.17.1.P.Seq


F 


M00021678D:H04 


CH03MAH 


209 


15381 


RTA00002903F.m.19.1.P.Seq


F 


M00007070C:C01 


CH02COH 


210 


1402 


RTA00002896F.i.04.1.P.Seq


F


M00004166C:B10


CH01COH 


211 


69026 


RTA00002893F.f.05.1.P.Seq


F 


M00003887C:E09 


CH01COH 


212 


33119


RTA00002922F.L 07.1. P.Seq 


F 


M00039067A:C05


CH09LNL 


213 


1166


RTA00002891F.c.05.1.P.Seq


F 


M00001677B:H08 


CH01COH 


214 


14345 


RTA00002891F.b.06.1.P.Seq


F 


M00001671C:F03 


CH01COH 


215 


10589 


RTA00002908F.d.16.1.P.Seq


F 


M00022392B:F01 


CH03MAH 


216 


13281 


RT A0000°916F o 01 1 P Sea 


F 


M00032652C:C07 


CH08LNH


217 


13281 


RTA00002916F.o.24.1.P.Seq


F 


M00032652C:C07 


CH08LNH 


218 


12248 


RTA00002898F.h.08.1.P.Seq


F 


M00004351B:G07 


CH01COH 


219 


164955 


RTA00002909F.n.09.1.P.Seq


F 


M00022706D:G08 


CH03MAH 


220 


15686 


RTA00002907F.f.24.1.P.Seq


F


M00022170C:C01


CH03MAH 


221 


15686


RTA00002907F.g.01.1.P.Seq


F


M00022170C:C01


CH03MAH 


222 




RTA00002891F.d.14.1.P.Seq


F


M00001683B:F11


CH01COH 


223 


6530 


RTA0000°900F d 04 1 P Sea 


F 


M00005409D:B02


CH02COH 


224 


1964 


RTA00002903F.c.10.1.P.Seq


F 


M00006885A:F07 


CH02COH 


225 


10547 


RTA00002893F.e.16.1.P.Seq


F


M00003884A:E12


CH01COH 


226 


13393 


RTA00002901F.a.06.1.P.Seq


F 


M00005422B:B08 


CH02COH 


227 


14809 


RTA00002915F.e.08.2.P.Seq 


F 


M00028773C:C05 


CH08LNH 


228 


46850 


RTA00002907F.e.21.1.P.Seq


F 


M0002215SB:B09 


CH03MAH 


229 


5398 


RTA00002911F.i.13.1.P.Seq


F


M00027004C:C11


CH04MAL 


230 


27569 


RTA00002910F.l.14.1.P.Seq


F 


M00023007D:D03 


CH03MAH 


231 


26277 


RTA00002898F.g.19.1.P.Seq


F 


M00004347C:A05 


CH01COH 


232 


185914 


RTA00002912F.k.01.1.P.Seq


F 


M00027467A:C07 


CH04MAL 


233 


14274 


RTA00002895F.c.09.1.P.Seq


F 


M00004066D:G10 


CH01COH 


234 


28396 


RTA00002907F.g.02.1.P.Seq


F 


M00022171A:F03 


CH03MAH 


235 




RTA00002916F.k.07.1.P.Seq


F


M00032614C:B10


CH08LNH


236 


6321 


RTA00002908F.b.17.1.P.Seq


F 


M00022372D:H12 


CH03MAH 


237 


21822 


RTA00002903F.a.17.1.P.Seq


F 


M00006861D:H10 


CH02COH 


238 


8440 


RTA00002923F.k.09.1.P.Seq


F 


M00039302B:E10 


CH09LNL 


239 


14677 


RTA00002905F.c.05.1.P.Seq


F 


M00007975D:F12 


CH03MAH 


240 


135005 


RTA00002902F.d.23.1.P.Seq


F 


M00006592A:A12 


CH02COH 


241 


5091


RTA00002895F.n.19.1.P.Seq


F


M00004115A:G12


CH01COH 


242 


24760 


RTA00002923F.h. IS. 1 .P.Seq 


F 


M00039270D:D02 


CH09LNL 


243 


21833 


RTA00002926F.f.22.2.P.Seq


F


M00040123C:A10


CH09LNL 


244 


12176 


RTA00002SS6F.2. IS. l.P.Seq 


F 


M00001353A:H07 


CH01COH 


245 


14407 


RTA00002901F.c.20.1.P.Seq


F 


M00005452D:E05 


CH02COH 


246 


186319


RTA00002912F.d.24.1.P.Seq


F 


M00027290C:F06 


CH04MAL 


247 


30135 


RTA00002907F.L2 L.2.P.Seq 


F 


M0002220SC:F08 


CH03MAH 


248 


33142 


RTA00002901F.l.01.1.P.Seq


F 


M00005607B:C04 


CH02COH 


249 


33142 


RTA00002901F.k.24.1.P.Seq


F 


M00005607B:C04 


CH02COH 


250 


16232 


RTA00002900F.n.10.1.P.Seq


F


M00005387A:B03


CH02COH 
251


23954 


RTA00002894F.k.06.1.P.Seq


F


M00004036B:A11


CH01COH 


252 


12399 


RTA00002918F.n.10.1.P.Seq


F 


M00032985D:G09 


CH08LNH 


253 


30853 


RTA00002907F.j.16.2.P.Seq


F 


M00022216D:D10 


CH03MAH 


254 


8615 


RTA00002907F.l.17.2.P.Seq


F 


M00022238C:G04 


CH03MAH 


255 


142359 


RTA00002905F.c.10.1.P.Seq


F


M00007980B:A07


CH03MAH 


256 


9565 


RTA00002926F.g.08.2.P.Seq 


F 


M00040127C:D02 


CH09LNL 


257 


17334 


RTA00002902F.d.03.1.P.Seq


F 


M00006582D:A09 


CH02COH 


258 


12540 


RTA00002886F.f.03.1.P.Seq


F


M00001349C:B04


CH01COH 


259 


17289 


RTA00002926F.f. 14.2.P.Seq 


F 


M00040118D:C05 


CH09LNL 


260 


46798 


RTA00002907F.i.13.2.P.Seq


F 


M00022202C:C04 


CH03MAH 


261 


7797 


RTA00002905F.b.06.1.P.Seq


F 


M00007953D:F07 


CH03MAH 


262 


1945 


RTA00002896F.p.04.1.P.Seq


F 


M00004201D:C03 


CH01COH 


263 


6084 


RTA00002896F.o.02.1.P.Seq


F 


M00004195A:F07 


CH01COH 


264 


6091 


RTA00002930F.c.03.1.P.Seq


F


M00042915B:G11


CH15CON


265 


186105 


RTA00002930F.c.10.1.P.Seq


F


M00055430A:A01


CH15CON


266 


11341 


RTA00002930F.h.07.1.P.Seq


F


M00055961C:B10


CH15CON


267 


2520 


RTA00002930F.e.10.1.P.Seq


F


M00055639A:E06


CH15CON


268 


136735 


RTA00002903F.k.06.1.P.Seq


F 


M00007006C:C12 


CH02COH 


269 


8336 


RTA00002900F.e.20.1.P.Seq


F 


M00004873B:G04 


CH02COH 


270 


13926 


RTA00002907F.h.19.1.P.Seq


F 


M00022193B:A09 


CH03MAH 


271 


11119 


RTA00002906F.k.01.1.P.Seq


F 


M00022009C:A08 


CH03MAH 


272 


11119 


RTA00002906F.j.24.1.P.Seq


F 


M00022009C:A08 


CH03MAH 


273 


11726 


RTA00002906F.l.07.1.P.Seq


F 


M00022051B:D07 


CH03MAH 


274 


6799 


RTA00002925F.g.21.1.P.Seq


F


M00039885C:D11


CH09LNL 


275 


17266 


RTA00002889F.g.09.1.P.Seq


F 


M00001544B:B05 


CH01COH 


276 


9479 


RTA00002924F.g.04.1.P.Seq


F


M00039560B:G09


CH09LNL 


277 


185557 


RTA00002912F.j.13.1.P.Seq


F


M00027457B:E11


CH04MAL 


278 


27872 


RTA00002906F.e.14.1.P.Seq


F


M00021923D:H02


CH03MAH


279 


15513 


RTA00002924F.g.21.1.P.Seq


F


M00039617C:A10


CH09LNL 


280 


4446 


RTA00002891F.m.15.1.P.Seq


F 


M00003773A:F10 


CH01COH 


281 


1681 


RTA00002916F.g.07.1.P.Seq


F


M00032577D:F01


CH08LNH 


282 


24243 


RTA00002887F.n.13.1.P.Seq


F 


M00001424D:D02 


CH01COH 


283 


16049 


RTA00002900F.c.11.1.P.Seq


F 


M00004846A:A10 


CH02COH 


284 


186267 


RTA00002910F.h.11.1.P.Seq


F 


M00022924B:A05 


CH03MAH 


285 


4543 


RTA00002925F.h.22.1.P.Seq


F 


M00039895D:C04 


CH09LNL 


286 


6176 


RTA00002914F.d.23.1.P.Seq


F 


M0002818SC:H11 


CH08LNH 


287 


29043 


RTA00002906F.h.17.1.P.Seq


F 


M00021974D:F01 


CH03MAH 


288 


696 


RTA00002922F.o.15.1.P.Seq


F 


M00039143A:F04 


CH09LNL 


289 


7225 


RTA00002891F.l.22.1.P.Seq


F 


M00003770C:A10 


CH01COH 


290 


25609 


RTA00002899F.h.15.1.P.Seq


F


M00004608A:H04


CH01COH


291 


6295 


RTA00002922F.o.24.1.P.Seq


F


M00039146B:G04


CH09LNL 


292 


186319 


RTA00002912F.e.01.1.P.Seq


F 


M00027290C:F06 


CH04MAL 


293 


4539 


RTA00002889F.d.04.1.P.Seq


F


M00001534C:E07


CH01COH


294 


17841 


RTA00002891F.m.06.1.P.Seq


F


M00003771D:A03


CH01COH


295 


13720 


RTA00002924F.c.05.1.P.Seq


F 


M00039430A:E04 


CH09LNL 


296 


7300 


RTA00002925F.a.14.1.P.Seq


F


M00039806B:D05


CH09LNL 


297 


186280 


RTA00002912F.f.13.1.P.Seq


F 


M00027316C:C03 


CH04MAL 


298 


185585 


RTA00002912F.n.04.1.P.Seq


F 


M00027569A:E05 


CH04MAL 


299 


3447 


RTA00002900F.l.11.1.P.Seq


F


M00005364B:E10


CH02COH 


300 


14487 


RTA00002889F.f.19.1.P.Seq


F


M00001542B:F09


CH01COH





301 


5338 


RTA00002901F.g.16.1.P.Seq


F 


M00005505A:F01 




302 


7766 


RTA00002917F.e.07.1.P.Seq


F 


M00032700A:E09 




303 


7450 


RTA00002899F.g.09.1.P.Seq


F 


M00004509D:C06 


CH01COH


304 


15369 


RTA00002908F.m. 1 8. 1 .P.Seq 


F 


M00022494D:A05 


CH03MAH 



305 


4954 


RTA000029 19F.L 17. 1 .P.Seq 


F


M00033150C:A11


CH08LNH 


306 


17189 


RTA00002900F.j. 11.1 .P.Seq 


F 


M00005333D:D08


CH02COH 


307 


186561 


RTA000029 l2F.m.23. 1 .P.Seq 


F 


M00027549C:G03 


CH04MAL 


308 


44645 


RTA00002896F.h.22. l.P.Seq 


F 


M00004165C:A11 


CH01COH 


309 


11404 


RTA00002924F.a.24. l.P.Seq 


F 


M00039413C:E06


CH09LNL 


310 


38212 


RTA00002893F.m.22.1.P.Seq


F 


M00003942A:D01


CH01COH 



311 


22099 


RTA00002890F.m.09. 1 .P.Seq 


F 


M00001648A:D10


CH01COH 


312 


25041 


RTA00002890F.p.12.1.P.Seq


F 


M00001661D:F06 


CH01COH 


313 


185938 


RTA00002911F.p.01.1.P.Seq


F 


M00027173C:E11 


CH04MAL 


314 


9414 


RT A00002908F.O.06. 1 .P.Seq 


F 


M00022509B:D11


CH03MAH 


315 


185707 


RTA000029 1 1 F.o. 19. 1 .P.Seq 


F 


M00027167C:B10 


CH04MAL 


316


185499 


RTA00002912F.n. 19. 1 .P.Seq 


F 


M00027586A:C09


CH04MAL 


317 


25704 


RTA000029 1 2F.n.22. 1 .P.Seq 


F 


M00027589B:G07 


CH04MAL 


318 


21068 


RT A00002 896F.h. 18. l.P.Seq 


F 


M00004164B:E12 


CH01COH 


319 


13440 


RT A000029 1 7F.e. 18.1 .P.Seq 


F 


M00032711B:F01


CH08LNH 


320 


3907 


RTA00002923F.i.18.1.P.Seq


F 


M00039285B:G04 


CH09LNL 


321 


21391 


RTA00002896F.g.03.1.P.Seq


F 


M00004158D:E08


CH01COH 



322 


6755 


RTA00002918F.l.01.1.P.Seq


F 


M00032944A:B07 


CH08LNH 



323 


155939 


RTA00002907F.j.23. 1 .P.Seq 


F 


M00022218B:B12


CH03MAH 


324 


8100 


RTA00002896F.g.21.1.P.Seq


F 


M00004160D:F06 


CH01COH 


325 


4785


RTA000029 19F.J. 18.1 .P.Seq 


F 


M00033183B:F10


CH08LNH 


326 


14947 


RTA00002902F.k.23. l.P.Seq 


F 


M00006743A:D04 


CH02COH 


327 


8295 


RTA00002903F.k.23. 1 .P.Seq 


F 


M00007031A:E02 


CH02COH 


328 


156277 


RTA00002907F.L 13. l.P.Seq 


F 


M00022237D:D06 


CH03MAH 


329 


22751 


RTA00002897F.l.15.1.P.Seq


F 


M00004282C:A12


CH01COH 


330 


7869 


RTA00002917F.j.15.1.P.Seq


F 


M00032749D:G03 


CH08LNH 


331 


156009 


RT A00002907F.k.05. 1 .P.Seq 


F 


M00022220A:A07 


CH03MAH


332 


9453 


RT A00002907F.k.2 1 . 1 .P.Seq 


F 


M00022228B:B11


CH03MAH 


333 


186052 


RTA000029 1 2F.h.08. 1 .P.Seq 


F 


M00027364B:E12 


CH04MAL 


334 


669 


RTA000029 1 7F.f.22. 1 .P.Seq 


F 


M00032723D:H02 


CH08LNH 


335 


11609 


RTA00002899F.f.23.1.P.Seq


F 


M00004507D:E03


CH01COH 


336 


186075 


RTA00002911F.k.19.1.P.Seq


F 


M00027057C:D10 


CH04MAL 


337 


935 [ 


RTA00002911F.l.20.1.P.Seq


F 


M00027081A:A08


CH04MAL


338 


11430 


RTA00002892F.e.07.2.P.Seq


F 


M00003S0SB:E0~ 


CH01COH 


339 


185938 


RT A000029 1 1 F.o.24. 1 .P.Seq 


F 


M00027173C:E11 


CH04MAL


340 


12394 


RT A000029 15F.m. 15.2.P.Seq 


F 


M00032497D:B10


CH08LNH 


341 


186588


RTA000029 1 1 F. 1.03. 1 .P.Seq 


F 


M00027064B:D06 


CH04MAL 


342 


23174 


RTA00002909F.e.17.1.P.Seq


F 


M00022600D:B05 


CH03MAH 


343 


4727 


RTA00002905F.g. 19. l.P.Seq 


F 


M0000S059D:S08 


CH03MAH 


344 


17048 


RTA00002887F.l.10.1.P.Seq


F 


M00001416B:A05 


CH01COH 


345 


2354 


RTA00002916F.o.03.1.P.Seq


F 


M00032645D:C01


CH08LNH


346 


1987


RTA00002894F.a.13.1.P.Seq


F 


M00003974D:E02 


CH01COH 


347 


24483


RTA00002897F.i.21.1.P.Seq


F 


M00004269A:G11


CH01COH 


348 


33337 


RTA00002896F.f.08.1.P.Seq


F


M00004155A:K03


CH01COH 


349 


11641 


RT A000029 1 6F.m. 19. 1 .P.Seq 


F 


M00032637A:F09 


CH08LNH


350 


10307 


RTA00002910F.1.01. l.P.Seq 


F 


M00022995C:G07


CH03MAH 
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351 


20388 


RTA00002906F.a.04.1.P.Seq


F 


M00021700D:H03


CH03MAH 


352 


24687 


RTA00002903F.m.02.1.P.Seq


F 


M00007048B:E11


CH02COH 


353 


10414 


RTA000029 1 9F. n. 1 9. 1 .P.Seq 


F 


M00033232B:C08 


CH08LNH


354 


11058 


RTA00002892F.h.16.2.P.Seq


F 


M00003820B:F11 


CH01COH 


355 


6574 


RTA000029 1 7F.0. 17. 1 .P.Seq 


F 


M00032797D:D08 


CH08LNH 


356 


18782 


RTA00002905F.f.07.1. P.Seq 


F 


M00008021C:G12


CH03MAH 


357 


35896 


RTA00002896F.d.04.1.P.Seq


F 


M00004146C:B04 


CH01COH 


358 


3518 


RTA00002930F.j.10.1.P.Seq


F 


M00056217D:E10 


CH15CON


359 


8820 


RTA00002915F.f.17.2.P.Seq


F 


M00028782A:F01 


CH08LNH 


360 


10208 


RTA00002897F.h.08. l.P.Seq 


F 


M00004251D:D03 


CH01COH 


361 


2089 


RTA00002896F.g. 14. l.P.Seq 


F 


M00004159D:F12


CH01COH 


362 


170919 


RTA00002909F.p.03. 1 .P.Seq 


F 


M00022727A:G01 


CH03MAH 


363 


8727 


RTA000029 1 7F.O.02. 1 .P.Seq 


F 


M00032791B:H11 


CH08LNH 


364 


33184 


RTA00002898F.d.08.1.P.Seq


F 


M00004324A:D10 


CH01COH 


365 


27973 


RTA00002905F.g. 13. l.P.Seq 


F 


M00008055D:G03 


CH03MAH 


366 


15835 


RTA00002897F.k. 13. l.P.Seq 


F 


M00004278C:B10 


CH01COH 


367 


10273 


RTA00002903F.n.03. l.P.Seq 


F 


M00007081B:E09 


CH02COH 


368 


2832 


RTA00002899F.f.03. l.P.Seq 


F 


M00004502A:D12 


CH01COH


369 


32022 


RTA00002903F.m. 12. 1 .P.Seq 


F 


M00007060D:G07 


CH02COH 


370 


68176 


RTA00002893F.g.11.1.P.Seq


F 


M00003898C:A01 


CH01COH 


371 


29378 


RTA00002915F.n. 14.2. P.Seq 


F 


M00032508A:E03 


CH08LNH 


372 


23235 


RTA00002925F.k.02. 1 .P.Seq 


F 


M00039929B:E06 


CH09LNL 


373 


12111 


RTA00002895F.0. 17. 1 .P.Seq 


F 


M00004122C:D01 


CH01COH


374 


5737 


RTA00002924F.k.02. 1 .P.Seq 


F 


M00039672C:D05 


CH09LNL 


375 


72475 


RTA000029 15F.1. 15. l.P.Seq 


F


M00032490D:E08


CH08LNH 


376 


7027 


RTA00002907F.O.0 1 . 1 .P.Seq 


F 


M00022264A:B02 


CH03MAH 


377 


17165 


RTA00002903F.d.19.1.P.Seq


F 


M00006907A:C09 


CH02COH 


378 


26446 


RTA00002894F.m. 17. 1 .P.Seq 


F 


M00004047C:B09 


CH01COH


379 


6755 


RTA00002918F.k.24.1.P.Seq


F 


M00032944A:B07


CH08LNH


380 


9336 


RTA00002909F.n.02. l.P.Seq 


F 


M00022703D:B11


CH03MAH 


381 


6960 


RTA00002916F.o.08.1.P.Seq


F 


M00032647B:F06 


CH08LNH


382 


472 


RT A000029 1 1 F.g.0 1 . 1 . P.Seq 


F


M00026936D:C07 


CH04MAL


383 


9460 


RTA00002908F.c.03.1.P.Seq


F 


M00022376D:D05


CH03MAH 


384 


10307 


RTA000029 10F.k.24. 1 .P.Seq 


F 


M00022995C:G07 


CH03MAH 


385 


4623 


RTA00002923F.d.22. 1 .P.Seq 


F 


M00039222B:A04 


CH09LNL 


386 


141167 


RTA00002905F.C.09. 1 .P.Seq 


F 


M00007980A:B01 


CH03MAH 


387 


34011 


RTA00002898F.m. 10. l.P.Seq 


F 


M00004385C:H12 


CH01COH


388 


5965 


RTA00002915F.a.07.1.P.Seq


F 


M00028620C:C07


CH08LNH


389 


12336 


RTA00002915F.g.04.1.P.Seq


F 


M00028784A:D12


CH08LNH


390 


36492 


RTA0O0O2S93F.US. l.P.Seq 


F 


M00003891B:H02 


CH01COH


391 


29803 


RTA00002908F.k.06. l.P.Seq 


F 


M00022467D:B03 


CH03MAH 


392 


4420 


RTA00002920F.a. 15.1 .P.Seq 


F 


M00033326B:B05 


CH08LNH


393 


15097 


RTA00002923F.b.06. 1 .P.Seq 


F 


M00039175A:F01 


CH09LNL 


394 


19133 


RTA00002894F.g.03.1.P.Seq


F 


M00003993C:D07 


CH01COH


395 


9810 


RTA00002905F.C.03. 1 .P.Seq 


F 


M00007975C:A10 


CH03MAH 


396 


31562 


RTA00002897F.a.09.1.P.Seq


F 


M00004210A:A03 


CH01COH


397 


1499 


RTA000029 1 2F.k. 12. l.P.Seq 


F 


M00027475D:A01 


CH04MAL 


398 


29531 


RTA00002907F.O.05. l.P.Seq 


F 


M00022265A:F11 


CH03MAH 


399 


4287 


RTA00002918F.j.20.1.P.Seq


F 


M00032928C:D02


CH08LNH


400 


28660


RTA00002905F.p.11.1.P.Seq


F 


M00021690A:C03 


CH03MAH 
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401 


4596 


RTA00002898F.f.21.1.P.Seq


F 


M00004341C:E05 


CH01COH


402 


21774 


RTA00002898F.c.20.1.P.Seq


F 


M00004322B:D03 


CH01COH


403 


5611 


RTA00002915F.e.12.1.P.Seq


F 


M00028774D:E10 


CH08LNH 


404 


7030 


RTA00002894F.1 . 13.1 .P.Seq 


F 


M00004042B:A11


CH01COH


405 


11736 


RTA00002898F.e.09.1.P.Seq


F 


M00004330A:A01


CH01COH


406 


94732 


RTA00002910F.e.17.1.P.Seq


F 


M00022856D:A07 


CH03MAH 


407 


30283 


RTA00002923 F. g. 19.1 .P.Seq 


F 


M00039 9 55DB01 


CH09LNL 


408 


129779 


RTA00002904F.a.18.1.P.Seq


F 


M00007155C:D07 


CH02COH 


409 


4635 


RTA00002900F. j .2 1 . 1 .P.Seq 


F 


M00005349C:C02 


CH02COH 


410 


5879 


RTA00002893F.f.08. l.P.Seq 


F 


M00003888B:F09 


CH01COH


411 


119206


RTA00002905F.m. 16.1 .P.Seq 


F 


M00021650D:A11


CH03MAH 


412 


6946 


RT A00002930F.g. 19.2.P.Seq 


F 


M00055880B:H10 


CH15CON


413 


42462 


RTA00002902F.f. 12. l.P.Seq 


F 


M00006631D:D02 


CH02COH 


414 


24285 


RTA00002896F.m. 17. 1 .P.Seq 


F 


M00004189A:C12 


CH01COH


415 


13769 


RTA0000290 1 F.a. 17.1 .P.Seq 


F 


M00005423C:D07


CH02COH 


416 


17039 


RTA00002896F.i. 14. l.P.Seq 


F 


M00004169A:E04 


CH01COH


417 


14397 


RTA00002896F.j.11.1.P.Seq


F 


M00004172D:B12


CH01COH


418 


14351 


RTA00002888F.c.21.1.P.Seq


F


M00001444C:D11


CH01COH


419 


5579 


RTA00002893F.j. 1 1. l.P.Seq 


F 


M00003914A:A08 


CH01COH


420 


24186 


RTA00002914F.n. 02. l.P.Seq 


F 


M00028366B:B08 


CH08LNH


421 


11433


RTA0000292 1 F.c.06. 1 .P.Seq 


F 


M00033342B:F03


CH09LNL 


422 


186635 


RTA000029 1 lF.f.06. 1 .P.Seq 


F 


M00026907D:E07 


CH04MAL 


423 


5955 


RTA00002915F.d.18.1.P.Seq


F 


M00028771A:E02 


CH08LNH


424 


22053 


RTA00002894F.k.09.1.P.Seq


F 


M00004036D:C12 


CH01COH


425 


9259 


RTA00002918F.b.09.1.P.Seq


F 


M00032830D:D02 


CH08LNH


426


25437 


RTA00002905F.0. 23. l.P.Seq 


F 


M00021681C:C09 


CH03MAH


427 


8488 


RTA00002916F.i.02.1.P.Seq


F 


M00032590B:H01 


CH08LNH


428 


4884 


RTA00002919F.o.12.1.P.Seq


F 


M00033248D:H11


CH08LNH


429 


9804 


RTA000029 15F.c. 19. 1 .P.Seq 


F 


M00028764B:D03


CH08LNH


430 


179954 


RTA000029 10F. j.04. 1 .P.Seq 


F 


M00022964A:B03 


CH03MAH


431 


186532 


RT A000029 1 2F.a.0 1 . 1 .P.Seq 


F 


M00027189C:B10


CH04MAL


432 


11015 


RTA00002894F.i.15.1.P.Seq


F 


M00004029D:A01 


CH01COH 


433 


8824 


RTA00002903F.b. 17. l.P.Seq 


F 


M00006877B:C09


CH02COH


434 


4063 


RTA00002916F.k.01.1.P.Seq


F 


M00032613A:E11


CH08LNH


435 


7964 


RTA00002896F.i.18.1.P.Seq


F 


M00004170A:F03 


CH01COH


436 


9238 


RTA00002915F.j.20.1.P.Seq


F 


M00032473B:A03 


CH08LNH


437 


2841 


RTA000029 L4F.t\ 15. 1 .P.Seq 


F 


M00028196A:G03 


CH08LNH


438 


11203 


RTA00002886F.p.16.1.P.Seq


F 


M00001382D:H08


CH01COH


439 


8800 


RTA00002888F.c.20.1.P.Seq


F 


M00001444B:E04 


CH01COH


440 


3224 


RTA000029 1 6F.d. 23. l.P.Seq 


F 


M00032556D:A03


CH08LNH


441 


95423 


RTA00002909F.k.24. l.P.Seq 


F 


M00022674C:H08


CH03MAH 


442 


7911 


RTA00002926F.c.11.2.P.Seq


F 


M00040079D:D09 


CH09LNL 


443 


88052 


RTA00002925F.p. 11.1 .P.Seq 


F 


M00040041D:F01 


CH09LNL 


444 


32736 


RTA00002900F.l.20.1.P.Seq


F 


M00005367D:A11


CH02COH


445 


20811 


RTA00002896F.n.14.1.P.Seq


F 


M00004192C:B06 


CH01COH


446 


12856 


RTA00002908F.b.07.1.P.Seq


F 


M00022368A:311 


CH03MAH 


447 


12190 


RTA00002899F.b.10.1.P.Seq


F 


M00004430B:310 


CH01COH


448 


10546 


RTA00002901F.o.08.1.P.Seq


F 


M00005689C:B02


CH02COH


449 


21041 


RTA00002898F.k.08.1.P.Seq


F 


M00004372A:E12 


CH01COH


450 


16484


RTA00002894F.c.04.1.P.Seq


F 


M00003979B:A04 


CH01COH
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LFBR ARY 


451 


7741 


RTA00002894F.L08. l.P.Seq 


F 


M000040 0 8 R • F 1 f) 


CH01COH


452 


14921


RTA00002926F.c.15.2.P.Seq


F 


M000400S ! C ■ FfV> 




453 


17571 


RTA00002900F.m.16.1.P.Seq


F 


M00005^7sn- a i n 




454 


46881 


RTA0000290 1 F.1.20. 1 .P.Seq 


F 


M000056~^ A HO° 




455 


21533 


RTA00002898F.j.l0.1.P.Seq 


F 


M00004365C:C09




456 


19010 


RTA000029 1 6F.k.08. 1 .P.Seq 


F 


M00032614D:D08 


CH08LNH


457 


48768 


RTA00002886F.n.01.1.P.Seq 


F 


M00001374C:B10


CH01COH


458 


7515 


RTA00002892F.p.22.2.P.Seq 


F 


M00003855C:F02


CH01COH 


459 


17326 


RTA00002898F.h.02. l.P.Seq 


F 


M00004350A:A04 


CH01COH 


460 


3902 


RTA00002901F.d.17.1.P.Seq


F 


M00005460D:C11 


CH02COH


461 


12400 


RTA0000290 1 F.d. 1 8. 1 .P.Seq 


F 


M00005461A:D12




462 


186543 


RTA00002912F.a.06.1.P.Seq


F 


M00027193C:A07 


CH04MAL


463 


4063 


RTA00002916F.j.24. l.P.Seq 


F 


M00032613A:E11


CH08LNH



464 


6267 


RTA00002910F.d.20.1. P.Seq 


F 


M00022835C:A09 


CH03MAH



465 


21349 


RTA00002901F.C.04. l.P.Seq 


F 


M00005445D:F11


CH02COH


466 


1123


RTA00002894F.i.24.1.P.Seq


F 


M00004031C:G06 


CH01COH 


467 


4401 


RTA00002918F.m.18.1.P.Seq


F 


M00032979D:C11


CH08LNH


468 


15255 


RTA00002925F.p. 10. l.P.Seq 


F 


M00040041A:G08


CH09LNL


469 


10991 


RTA00002933F.a.15.1.P.Seq


F 


M00043077OG10 


n i _y v, wi ) 


470 


48768 


RTA00002886F.m.24.1.P.Seq


F 


M00001374C:B10 


CH01COH


471 


20406 


RTA00002900F.C.20. l.P.Seq 


F 


M00004852D:C06




472 


39784 


RTA00002886F.g.05.1.P.Seq


F 


M00001352B:B02 


CH01COH


473 


36567 


RTA00002886F.n.06.1.P.Seq


F 


M00001375B:D04 


CH01COH


474 


14817 


RTA00002902F.a. 18.1 .P.Seq 


F 


M00005771D:C02 


CH02COH


475 


156277 


RTA00002907F.1.13.2.P.Seq 


F 


M00022237D:D06 


CH03MAH


476 


6898 


RTA00002907F.a.22. 1 .P.Seq 


F 


M00022104A:G08 


CH03MAH


477 


17376 


RTA00002902F.C.03. 1 .P.Seq 


F 


M00005819D:F09 


CH02COH


478 


186535 


RTA000029 1 2F.d. 12. 1 .P.Seq 


F 


M00027270A:D04 


CH04MAL


479 


91616 


RTA00002910F.b.24.1.P.Seq


F 


M00022812A:G01


CH03MAH


480 


91616 


RTA00002910F.c.01.1.P.Seq


F 


M00022812A:G01


CH03MAH 


481 


6993 


RTA00002896F.j.12.1.P.Seq


F 


M00004172D:F04 


CH01COH


482 


12443 


RT A000029 1 6F.a.20. 1 .P.Seq 


F 


M00032534B:E12


CH08LNH



483 


28585 


RTA0000290 lF.j. 16. 1 .P.Seq 


F 


M00005570A:D05 


CH02COH


484 


9453 


RTA00002907F.k.2 1 .2.P.Seq 


F 


M00022228B:B11


CH03MAH



485 


156009 


RTA00002907F.k.05.2.P.Seq 


F 


M00022220A:A07


CH03MAH 



486 


5958 


RTA00002908F.n.22.2.P.Seq


F 


M00022507C:C08 


CH03MAH 



487 


155939 


RTA00002907F.j.23.2.P.Seq


F 


M00022218B:B12


CH03MAH



488 


16695 


RTA0OOO2SS6F.* 22. l.P.Seq 


F 


M00001353D:E05


CH01COH 


489 


10118 


RTA00002886F.h.18.1.P.Seq


F 


M00001356D:E06 


CH01COH 


490 


13288 


RTA00002930F.b.21.1.P.Seq


F 


M00042891C:G08


CH15CON 


491 


3210 


RTA00002910F.h.22. l.P.Seq 


F 


M00022945A:H09 


CH03MAH


492 


15014 


RTA00002934F.a. 18. 1 .P.Seq 


F 


M00043528A:E11


CI-P0COHT V 


493 


22087 


RTA00002890F.i.19.1.P.Seq


F 


M00001624A:C01 


CH01COH 


494 


31948


RTA000029G8F.L 12. l.P.Seq 


F 


M00022454C:B08 


CH03MAH 


495 


11593 


RTA00002906F.p.2 1.1. P.Seq 


F 


M00022094B:G02 


CH03MAH 


496 


3131 


RTA00002908F.m.17.1.P.Seq


F 


M00022494B:D06 


CH03MAH 


497 


151263 


RT A00002906F.1.2 1 . 1 .P.Seq 


F 


M00021991D:F09 


CH03MAH 


498 


177542 


RTA000029 10F.h.23. 1 .P.Seq 


F 


M00022945B:F11


CH03MAH 


499 


9738 


RTA00002924F.f.23. l.P.Seq 


F 


M00039559B:C07


CH09LNL 


500 


15313 


RTA00002925F.f.05.1.P.Seq


F 


M00039865A:C09


CH09LNL
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501 


19724 


RTA00002906F.p.05. 1 .P.Seq 


F 


M00022088D:E10 


CH03MAH 


502 


10731 


RTA00002893F.m.11.1.P.Seq


F 


M00003938C:A05


CH01COH 


503 


10257 


RTA00002901F.h.09.1.P.Seq


F 


M00005512B:H01 


CH02COH 


504 


186468 


RT A000029 1 3F.b. 18.1 .P.Seq 


F 


M00027746A:D06 


CH04MAL 


505 


14736 


RTA00002908F.g.22.1. P.Seq 


F 


M00022436C:F11 


CH03MAH 


506 


33267 


RTA00002889F.h.14.1.P.Seq


F 


M00001548B:D06 


CH01COH 


507 


7719 


RTA00002908F.e. 11.1 .P.Seq 


F 


M00022403C:E12


CH03MAH 


508 


185539 


RTA000029 1 3F.b.03 . 1 .P.Seq 


F 


M00027717C:C06


CH04MAL 


509 


14825 


RTA00002924F.f. 19. 1 .P.Seq 


F 


M00039556C:G05


CH09LNL 


510 


3917 


RTA00002906F.p. 15. 1 .P.Seq 


F 


M00022092D:A11


CH03MAH 


511 


18718 


RTA00002895F.h.05. 1. P.Seq 


F 


M00004085B:H02 


CH01COH 


512 


186762 


RTA00002918F.b.11.1.P.Seq


F 


M00032831A:C07 


CH08LNH 


513 


2732 


RTA00002925F.i.07.1. P.Seq 


F 


M00039900B:G04 


CH09LNL 


514 


7684 


RTA00002924F.j.17.1.P.Seq


F 


M00039668C:F01 


CH09LNL 


515


6852 


RTA00002922F.C.22. 1 .P.Seq 


F 


M00039001A:B10


CH09LNL 


516 


1422 


RTA00002924F.C.09. 1 .P.Seq 


F 


M00039433C:E03 


CH09LNL 


517 


5560 


RTA00002917F.i.01.1.P.Seq


F 


M00032734C:C05 


CH08LNH 


518


48734 


RTA00002908F.l.23.1.P.Seq


F 


M00022487C:C02 


CH03MAH 


519 


10486 


RTA00002899F.g.07. 1 .P.Seq 


F 


M00004509B:B10 


CH01COH 


520 


33514 


RTA00002890F.j.03.1.P.Seq


F 


M00001626A:D07 


CH01COH 


521 


5821 


RTA00002917F.m.01.1. P.Seq 


F 


M00032772D:D03 


CH08LNH 


522 


5821 


RTA00002917F.l.24.1.P.Seq


F 


M00032772D:D03 


CH08LNH


523 


21940 


RTA00002896F.a.03. 1 .P.Seq 


F 


M00004134A:A08 


CH01COH 


524 


185724 


RTA000029 12F.m.08. 1 .P.Seq 


F 


M00027523A:H05 


CH04MAL 


525 


182887 


RT A000029 1 0F.k.2 1 . 1 .P.Seq 


F 


M00022992A:H06 


CH03MAH 


526 


21346 


RTA00002901F.g.24.1. P.Seq 


F 


M00005507B:A03 


CH02COH 


527 


5501 


RTA00002887F.n.l2.l.P.Seq 


F 


M00001424B:H06 


CH01COH 


528 


13961 


RTA00002892F.j. 14. 1 .P.Seq 


F 


M00003828A:D11 


CH01COH 


529 


16784 


RT A00002886F.a.09. 1 .P.Seq 


F 


M00001338D:D01 


CH01COH


530 


17628 


RT A000029 1 6F. f. 1 0. 1 . P . Seq 


F 


M00032568B:F08


CH08LNH


531 


3304 


RTA00002898F.d.05.1.P.Seq


F 


M00004324A:B03 


CH01COH 


532 


14895 


RTA00002901F.g.14.1.P.Seq


F 


M00005504C:F12 


CH02COH 


533 


16036 


RTA00002891F.k.09.1.P.Seq


F 


M00003763B:B10 


CH01COH 


534 


23877 


RTA00002891F.k.15.1.P.Seq


F 


M00003764B:F11 


CH01COH 


535 


186784 


RTA0OO0293OF.L 17. 1 .P.Seq 


F 


M00056105A:D06 


CH15CON


536 


13591 


RT A0000290 1 F. f. 1 5 . 1 . P . Seq 


F 


M00005485C:H04


CH02COH 


537 


17916


RTA00002906F.p.08.1.P.Seq


F 


M00022090B:A10 


CH03MAH 


538 


40594 


RTA00002897F.i.15.1.P.Seq


F 


M00004266B:F07 


CH01COH 


539 


9677 


RTA00002925F.i.21.1.P.Seq


F 


M00039915B:E08 


CH09LNL 


540 


7736 


RTA00002887F.e.03.1.P.Seq


F 


M00001393B:C03 


CH01COH


541 


2474 


RTA00002917F.e. 15.1. P.Seq 


F 


M00032707D:F08 


CH08LNH


542 


23810 


RTA00002892F.i.06.1.P.Seq


F 


M00003822D:A02


CH01COH 


543 


24633 


RTA00002907F.i.19.1.P.Seq


F 


M00022208B:D03 


CH03MAH 


544 


72081 


RTA00002925F.k.03.1.P.Seq


F 


M00039929D:H10 


CH09LNL 


545 


5991 


RTA00002916F.i.17.1.P.Seq


F 


M00032597A:H02 


CH08LNH


546 


14596 


RT A000029 1 IF. n. 15. 1. P. Seq 


F 


M00027131A:B03


CH04MAL


547 


6923 


RTA00002896F.d.01.1.P.Seq


F 


M00004146B:E08 


CH01COH 


548 


6923 


RTA00002896F.c.24.1.P.Seq


F 


M00004146B:E08 


CH01COH 


549 


21851


RTA00002887F.d.09.1.P.Seq


F 


M00001391D:D03 


CH01COH 


550 


3935 


RTA00002925F.j.08. 1 .P.Seq 


F 


M00039921C:H11 


CH09LNL 






WO 01/02568 



PCT/US00/18374 



SEQ ID


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


551 


13328 


RTA00002909F.h.08.1.P.Seq


F 


M00022634B:H09 


CH03MAH 


552 


2492 


RTA00002SS7F.C. 11.1 .P.Seq 


F 


M00001393D:E02 


CH01COH 


553 


11960 


RTA00002917F.b.03.1.P.Seq


F 


M00032671B:D08 


CH08LNH 


554 


186084 


RTA00002912F.f.18.1.P.Seq


F 


M00027319D:F07 


CH04MAL 


555 


13644 


RTA00002925F.a.09. 1 .P.Seq 


F 


M00039805B:B06 


CH09LNL 


556 


5707 


RTA00002909F.k. 13.1. P.Seq 


F 


M00022672C:H04 


CH03MAH 


557 


95700 


RT A000029 1 1 F. p. 1 4. 1. P. Seq 


F 


M00027182B:G06 


CH04MAL 


558 


342 


RT A00002922F. i .23 . 1 .P .Seq 


F 


M00039076D:G04 


CH09LNL 


559 


8481 


RTA00002887F.C. 12. 1 .P.Seq 


F 


M00001389D:D06 


CH01COH


560 


12575 


RTA00002916F.i.12.1.P.Seq


F 


M00032594C:F05 


CH08LNH 


561 


40712 


RTA00002921F.d.08.1. P.Seq 


F 


M00033359C:H05 


CH09LNL 


562 


10768 


RTA00002886F.d.24.1.P.Seq


F 


M00001346B:G11 


CH01COH 


563 


38781


RTA00002889F.k.23.1.P.Seq 


F 


M00001559A:H09 


CH01COH 


564 


8790 


RTA00002888F. g.08. 1. P.Seq 


F 


M00001461D:C10 


CH01COH 


565 


10167 


RTA00002916F.k.22.1.P.Seq


F 


M00032621A:F11 


CH08LNH 


566 


13706 


RT A00002905F.e.2 1 . 1 .P.Seq 


F 


M00008019B:A01


CH03MAH 


567 


124172 


RTA00002900F.a.09. 1 .P.Seq 


F 


M00004824A:D12 


CH02COH 


568 


92126 


RTA00002910F.g.12.1.P.Seq


F 


M00022904C:D04 


CH03MAH 


569 


5830 


RTA00002916F.j.09.1.P.Seq


F 


M00032605B:D09 


CH08LNH 


570 


15154 


RTA00002886F.p.13.1.P.Seq


F 


M00001382D:A07 


CH01COH 


571 


25813 


RTA00002910F.i.12.1.P.Seq


F 


M00022952A:B02 


CH03MAH 


572 


17268 


RTA00002886F.d.07.1.P.Seq


F 


M00001344D:E08


CH01COH 


573 


13684 


RTA00002915F.j.09.1.P.Seq


F 


M000314S5B:G05 


CH08LNH 


574 


13460 


RTA00002898F.f.19.1.P.Seq


F 


M00004341C:A09


CH01COH 


575 


25115 


RTA000029 19F.p. 18.1 .P.Seq 


F 


M00033311B:G10


CH08LNH 


576 


19949 


RTA00002905F.e. 17. 1 .P.Seq 


F 


M00008016B:E09


CH03MAH 


577 


24266 


RTA00002917F.k.06.1.P.Seq


F 


M00032759A:A03 


CH08LNH 


578 


8243 


RTA00002901F.o.17.1.P.Seq


F 


M00005703B:E03 


CH02COH 


579 


12576


RTA00002900F.k.23.1. P.Seq 


F 


M00005359B:S0S 


CH02COH 


580 


28531


RTA00002909F.c.04.1.P.Seq


F 


M00022559D:G10 


CH03MAH 


581 


15153 


RTA00002894F.o.21.1.P.Seq


F 


M00004054A:D03 


CH01COH


582 


9498 


RTA00002894F.e.04.1.P.Seq


F 


M00003985D:B02


CH01COH


583 


48140 


RTA000029 14F.h. 13.1 .P.Seq 


F 


M00028211A:F10


CH08LNH


584 


7626 


RTA00002895F.b.04.1.P.Seq


F 


M00004061B:E05 


CH01COH


585 


22668 


RTA00002896F.p.17.1.P.Seq


F 


M00004204C:H08


CH01COH 


586 


45691 


RTA00002908F.a. 11.1 .P.Seq 


F 


M00022305A:B04 


CH03MAH 


587 


30429 


RT A00002904F.a. 19.1 .P.Seq 


F 


M00007155D:C09 


CH02COH 


588 


46969 


RTA00002909F.g.02. 1. P.Seq 


F 


M00022618C:E04


CH03MAH 


589 


44030 


RTA00002900F.O.23. 1 .P.Seq 


F 


M00005405C:D01 


CH02COH 


590 


142548 


RTA00002905F.h. 10. 1 .P.Seq 


F 


M00008073A:D01 


CH03MAH 


591 


18455 


RTA00002905F.g.18.1.P.Seq


F 


M00008059B:F08 


CH03MAH 


592 


7501 


RTA00002894F.g.05.1.P.Seq


F 


M00003993D:B03 


CH01COH


593 


7280


RTA00002893F.n.22.1.P.Seq


F 


M00003959D:A05 


CH01COH


594 


19339 


RTA00002898F.l.12.1.P.Seq


F 


M00004376D:A12


CH01COH


595 


30194 


RTA00002922F.k.05.1.P.Seq


F 


M00039100A:G04


CH09LNL 


596 


32650 


RT A000029 1 IF.L05. LP. Seq 


F 


M00026994D:D07 


CH04MAL 


597 


10510 


RTA00002905F.d.17.1.P.Seq


F 


M00008001B:F05


CH03MAH 


598 


13539 


RTA00002S98F.t*.03. LP.Seq 


F 


M00004336A:A01 


CH01COH


599 


20149 


RTA00002917F.o.03.1.P.Seq


F 


M00032791D:F01 


CH08LNH


600 


12780


RTA00002891F.e.06.1.P.Seq


F 


M00001686D:F06


CH01COH
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x— 1 1 i ix . vi x i 


601 


182479 


RT AOOOO^ 10F i IS 1 P Sen 


p 
i 


m nno^ q ? ~* c ■ p n s 


runiM \ LI 
v-HUjivlAri 


602 


14016 


RT A0000^9°3F n 17 I P Sen 


X 


Monmo'iiir- a i i 


runor \rr 


603 


76075 


RT AOOOO">890F h ~>3 1 P Sen 

l\ 1 *k UWUU- w y vy 1 . 1 1 . . 1 . k . 1 . JCU 


p 


MDOOn 1 fi~>OR - ADi 

IV luUUU 1 vj WD . .AvJ 


T_rn i r^o u 
v«riUlL,vJrl 


604 


9806 


RT A 0000^9 T*>F t OS 1 P Sen 
ix x nui/uu-7— i . i . vj j . i .r. jcu 


p 






605 


9086 


RT AOO0O^889F k 1*5 I P Sen 


p 


MODDO 1 S S 3 A • F06 

1V1\j\jv^*J x _> J O A.EUO 


run i row 
v_ riu I v_ vjrl 


606 


2619 


RTA0000^907F o P 1 P Sen 


p 


MDOn^^^^QP AOJ. 


V— nuj ivirvri 


607 


175 17 


RT A0000" > 907F h 06 1 P Sea 


p 


MOOD^' 7 I S^ A RD^ 

I'lUWU-- 1 0_> . XJ VJ-7 


v_ rivj j i* i An 


608 


5089 


RTA00002915F.e.22.2.P.Seq


F 


M OOO'? 87 7 7 R • G04 

l»l vUU» U / / / 1_J . XJ ^J*T 


CH08LNH 



609 


6728 


RTA00002904F.b.13.1.P.Seq


p 


M00007 178 A CO 0 


CH02COH



610 


41149 


RT A0000 9 899F ° ^0 1 P Sea 


p 


M 0000460" B • E0° 


CH01COH 



611 


35017 


RT A0000"> S9?F f 03 7 P Sea 


p 


M000038 PC - A03 

IV l\J\y\J\J y O 1 _ v. . . \.\J J 


CH01COH 



612 


7008 


RTA0000°9°3F a 08 1 P Sea 


p 


M00039 1 6 ^ D • C04 

1VX\J\JVJ J 7 1 <J_/ 1— ' . V. — VJ*T 


CH09I NT 


613 


185545 


RTA0000°9PF k 16 1 P Sea 


F 


M000°7480C ■ E09 


CH04MAL 



614 


17840 


RT A000O?89°F d 15 "> P Sea 


F 


M00003S54B F07 

I'lWVvUJO J**1J .1 VJ / 


V 1 i VJ 1 V— VJ i 1 


615 


185914 


RTA0000°9 PF i ">4 I P Sea 


p 


M000' 7 7467 A C07 

i»l vjvjvJ / t / ."^ . V_, <J / 


CH04M Al 

Vw 1 I VJ *-T 1 V 1 1 


616 


6862 


RTA0000°903F b 08 1 P Sea 


p 


M0000687°n- R07 

IVlvjVJvJVJVJ o / _ Ly . U VJ / 


v_ livj— v_ vjn 


617 


20120 


RTAOO0O°888F c ?4 1 P Sea 


p 


M0000 1 44 > R • F06 

1 v l VJVJVJVJ x ' » _ U . X VJVJ 


CHOICOH 

v i ivj i \wVjn 


618 


20120 


RTAOOOO°8S8F d 01 1 P Sea 


p 


M0000 1 44 -F06 

iVl VJVJVJVJ 1 *T"T^ L_> . 1 VJVJ 


CHOI fOH 
v., i ivj i v — vj n 


619 


13879 


RT A0000°9" , 3F d 0° I P Sea 

i\ X VJ VJ vy \y — 7 — J 1 ,U>Uu. 1 • 1 . «JV- 


p 


M00039°07 A F07 

l*iV>vjvj J y 1 .x vj / 


CH09T NT 

V_ l IVJ 7Li>L 


620 


9330 


RT AOOOO"^ 15F * 16 1 P Sea 

1X1 * w UUvv/w y k y k IV- 1 ■! JvU 


F 


Mnon°87X^R • A04 

IV LVJ VJVJ — O / O VJ i-J . y\.\J- 


rHD8T MT4 
v nvjoi-i 1 1 1 


621 


21572 


RTA0000^9° IF h 19 1 P Sea 

ix 1 i\\JUvv/«.y _ IX .11. l y • l .1 . .JV^Cj 


p 


1VXVJVJVJ J J*+*+ i - i. - i_> 1 — 


V_, I IVJ y l^i N 


622 


2941 


RT A0000^9 19F h 1 P Sea 

IX 1 riUuUVJ— 7 1 y 7 ! .11- — —- 1 .1 .OV-Vi 


F 


MOOD W 1 A • T\CO 

iViVJVJvjj j 1 j~\ . LJ VJ — 


v_ i lvjo i_^> n 


623 


32154 


RT A0000°90' J F b 16 1 P Sea 

ix i . \uuu vy — y vy ^y i ,u. l w. i . i . v_ 


F 


M0nD07Q^Q D ■ CO 1 

i» iVJVJvjVJ / y \Jy Ly ■ V_ VJ i 


PKO^iVf AH 

V^. ilUJ iVl/AXl 


624 


20875 


RT A0000">90 1 F k 16 1 P Sen 

ix 1 . v UUUu — 7U 11 . JV . 1 VJ . L . 1 . JCU 


p 


iVl VJVJ VJVJ _J VJ VJ _/ O - XX VJ J 


V_ XI VJ— X- Vjil 


625 


186324 


RTA0000^9PFd 17 1 P Sea 

xx x . v y i — l .vi. i / • 1 .x . Jtu 


F 


Mnon^777_i A • ADQ 

IViVJVJVJ— / — / -rr\ . ^AVJ" 


PHOaxr at 


626 


10768 


RTAOOOO°886F e 01 1 P Sea 

XX X a ». w \J \J V/ . • KJ KJ KJ X . - V-/ X • X -X ■ JvU 


F 


M00001346B:G11



CH01COH



627 


16711 


RT A0000^935F m li 1 PSea 

X X A. a i Vy \_/ V-/ mS m~S m./ k ■in. 1 A - t * X . W. VJ 


F 


M00055 00 1C-H1 1 

1'iUUUJ J - — 1 V^. . X X X X 


CH17COHLV 



628 


14688 


RTA00002925F.n.14.1.P.Seq


p 


M0004007 3B B 10 


CH09LNL



629 


44419 


RTA00002907F.b. 19. 1 .P. Seq 


F 


M00022118A:E06


CH03MAH


630 


12614 


RTA00002896F.p.03.1.P.Seq


F 


M00004°0 1 DC01 


CH01COH


631 


21658 


RTA00002902F.c.23.1.P.Seq


F 


M00006576D -C0° 


CH02COH



632 


10150 


RT A0000" > 901F i 16 1 PSea 


F 


M00005540A:F09


CH02COH



633 


185909 


RTA00002912F.c.20.1.P.Seq


F 


M000°7°6° A A07 


CH04MAL


634 


14893


RTA0000 o S90F f 08 1 PSea 



F 


M00001607D:H09



CH01COH



635 


32125 


RT A0000°903F c OS 1 P Sea 

xx x i x w w w \y — ✓ v./ x • w * w • x - x - ^ vj 


F 


M000068S4D- AOS 

ivnjvjvjvj'JvJ o ~ i ' . _ k.vj«J 


CHOICOH 

1 ivy — v — vy i i 


636 


11909


RT A0000^90^F a 1 1 1 P Sea 

XX X * \V V V_/ Vy mmmt -S \J ^ X , LX . 1 1 • i . X >iJvU 


F 


M00005766D-D P 

i ▼ i w vy vy vy w> / vjvj i ' . i_y i — . 


CHOTOH 

Vw 1 — V_ VJ 1 X 


637 


17237 


RT A0000 9 901F I P I PSea 


F 


\f 000056 1 6B ^07 

i » 1 vy w W vy y \y ivj j_y . x vj ' 


CHOICOH 

V X IVJ — V VJ I L 


638 


1 1 14S 


RTA0000°900F i IS 1 PSea 


F 


M00005 >46D- A03 

IV IWVJVJVJ J J *"TVJ J — ' . . V VJ 


CHOICOH 

X— x iiy — v_ vy i i 


639 


14837 


RTA00002925F.n.20.1.P.Seq


F 


M000400 n 5 A B04 


CH09LNL 



640 


4343 


RTA00002897F.l.13.1.P.Seq


F 


\I00004°S' :) B-D07 


CH01COH


641 


18686


RTA00002898F.j.16.1.P.Seq


F 


M00004366D:C11



CH01COH



642 


10090 


RTA0000°S9 n F n 10 ^ P Sea 

XX X m. * \y V/VU mmmr KJ *S mm* X .il«Xw*^*X .iJVvU 


F 


\ 10000 " 84° D • H09 

i\y\y\y\jy <j " i~y . L\y y 


CHOICOH 

V_» I lu 1 X, VJ 1 1 


643 


612 


RTA00002889F.d.13.2.P.Seq


F 


M00001535B:E02 


CH01COH


644 


10752 


RTA00002892F.n.06.2.P.Seq


F 


M00003842D:D11


CH01COH


645 


167203 


RTA00002914F.c.14.1.P.Seq


F 


M00028070A:H09


CH08LNH


646 


21269 


RTA00002901F.j.15.1.P.Seq


F 


M00005570A:B08


CH02COH 


647 


186250 


RTA00002910F.a.21.1.P.Seq


F 


M00022797D:A06 


CH03MAH


648 


24633 


RTA00002907F.i. 19.2. P.Seq 


F 


M00022208B:D03


CH03MAH 


649 


12295 


RTA00002918F.c.02.1.P.Seq


F 


M00032836B:A07


CH08LNH


650 


7870 


RTA00002905F.b.22. 1 .P.Seq 


F 


M00007973B:D11


CH03MAH
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I I~RR AT? Y 


651 


1 999s 


RT A 0000^90° F ri OS 1 P <^n 


r 


\/f oonn a >q< a • piH7 

IV1UUUUO J o J .A . uu / 


CH02COH 


652 


7775 


RT A0000 o 89°F n P 9 p Q-r, 


p 


N/TOOOOl Q J.T A • f-jn l 


CH01COH


653 




RT ADOOO^Q^QF f ~> 1 1 P <^n 






C_H 14EDT 


654 


68i i 

uoj i 


RT AOOOO~ , Q97F h 9 I 1 P 


p 


IV1UUU jV4o J A. U iU 


LH 12bDT 




10718 
1 U / Jo 


RTA00009010F h OS 1 P 

t\ L .auuuu— .7 jur. d.uo. i .r.ocC] 


p 


^yf^^^y1T7^ i \ •r^riz 
lvIVukJ^J. /i4A:vjUo 




656 




RT AOnOO">Q19F n of) 1 P <\^n 
iv i .-vuuuu.-, y i .r . jcq 


p 


iviuuu^zy / :rU4 


Lh ioCON 




91 161 


RT An0009<2Q^p h 01 1 P 

tv l .AUUUu__oy jn.n.uj . l .r .ocq 


p 


JV1UUUU4U0 0 A . riU 1 


CH01COH


\J~> o 




RT A OOOO^Q^IF i !5 1 P <?,-n 


P 


iYlUUU jy«o4U.riU / 


/^tj/^ot xrr 
CrlUyLiNL 


yj^J y 




R X A OOOn9QO^P r» OS I P C^n 


p 


IVlUUUUoUU / o . tu J 


LiiUjMAH 


660 


I 11 1 7 


RT AOOOO">887F n 0 1 1 P xVn 


p 

r 


\yfnnnn i p^D^nn^ 

MUUUU i 4_ _ d . UUO 


/^Tjn i rnu 
CHUlCvJil 


661 


10656 


RT AOOOO n 006F 1 01 1 P <^n 
ix i .-\vjvjvjvj— 7\jur . i .u j . l .r. jtij 


p 


1V1UUU-: IK) j - A . LjUj 


L.MUj_V1Ait 


662 


7852 


RT A0000~>S89F p 14 1 P <\r-n 

IX I rVUvvu_uO/l .C 1~*. 1 .1 ,JCU 


p 


MOOOO 1 > 1 's R ■ A 07 


PT40 1 row 


663 


13217 


RT AOO0O^887F m 94 1 P Sen 


P 


X/fOOOO li'^'iR PiOA 
1VIUUUU 1 t 1 lj . JJUD 


run i ppif-T 
riu l LUn 


664 


15 152 


RTA0O00" , 9^"SF f ">4 1 P St»n 


F 
i 


M00019S71R-R04 


PROOT NTT 


665 


74141 


RT A0000 n 99^F n IS 1 P Sen 


p 


MOOOIQ i i^n-r 1 0 


VwXlUyi-iN l_ 




^jo / £ 


RT AOO00" l S9°F i 11 1 P Sen 

Ix 1 ,-\.UUUU-_o.7_r-I. 1 J. 1 - x . kjCLj 


p 


\/fnooni \ ha 

J.VIUUUU J o -id. A.U0 


run l rriu 


667 




RT A0000 n 906F cr 91 1 p Sen 


p 


1VIUUUZ Ib^O / U.rlUO 


v^nUJiviAxl ! 


uuo 


9575Q 


RT A0000"*907F m 10 1 POn 
ix i nuuuu-7W / r.rn. iu. i .r . j>cq 


p 
r 


IVlUUUZ-;-4y U.x^U 1 
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919 
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21107 
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CH16C0P 
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RTA00002931F.b.23. l.P.Seq 


F 
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RTA00002932F.a. 1 7. 1 .P.Seq 


F 
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CH18C0N 
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9484 


RTA00002934F.a.13.1.P.Seq


F 
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CH20COHLV 
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RTA00002890F.j.2 1 . 1 .P.Seq 


F 


M00001633D:C11 


CH01COH 


928 


16061 


RTA00002932F.b.24.1.P.Seq


F 


M00043113C:G09 


CH18CON


929 


4834 


RTA00002930F.d.12.1.P.Seq


F 


M00055527B:E01 


CH15CON


930 


9427 


RTA00002935F.c.21.1.P.Seq


F 


M00054623C:F05 


CH17COHLV
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167736 


RTA00002935F.1.11.1.P.Seq


F 


M000551 L7A:E02 


CH17COHLV


932 


16524 


RTA00002935F.b.19.1.P.Seq


F 
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CH17C0HLV 
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CH18C0N 
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F 
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RTA00002925F.p. 16. 1 .P.Seq 
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M00040045B:H07 


CH09LNL 
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CH08LNH 
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CH01COH 
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F 


M000334i3A:A08 


CH09LNL 


948 


16638 


RTA00002899F.a.15.1.P.Seq


F 


M00004420D:E05 


CH01COH 


949 


8869 


RTA00002929F.a.05.1.P.Seq


F 


M00039-4SC:G09 


CH14EDT 


950 


14426 


RTA00002914F.c.12.1.P.Seq


F 


M00028069D:H02


CH08LNH



WO 01/02568 



PCT/US00/18374 



ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


951 


11994 


RTA00002890F.h.14.1.P.Seq


F 


M00001617C:F10 


CH01COH 


952 


186664 


RTA00002932F.a.05.1.P.Seq


F 


M00042585D:D03 


CH18CON


953 


162235 


RTA00002907F.j.06.2.P.Seq


F 


M00022212D:G02 


CH03MAH 


954 


2127 


RTA00002912F.o.l4.1.P.Seq 


F 


M000276053:D09 


CH04MAL 


955 


41014 


RTA00002901F.n.04.1.P.Seq


F 


M00005641B:E09


CH02COH 


956 


17636 


RTA00002933F.c.19.1.P.Seq


F 


M00043222C:B06 


CH19COP 


957 


2328 


RTA00002935F.e.05.1.P.Seq


F 


M00054579A:C02 


CH17COHLV 


958 


15414 


RTA00002935F.p.13.1.P.Seq


F 


M00055423C:H10


CH17COHLV 


959 


11948 


RTA00002895F.o.01.1.P.Seq


F 


M00004118C:D12 


CH01COH 


960 


24759


RTA00002903F.n.05.1.P.Seq


F 


M00007082D:E05 


CH02COH 


961 


15152 


RTA00002925F.g.01.1.P.Seq


F 


M00039873B:H04 


CH09LNL 


962 


14917 


RTA00002922F.b.02.1.P.Seq


F 


M00038616D:B07 


CH09LNL 


963 


12941 


RTA00002889F.c.15.1.P.Seq


F 


M00001532A:G08 


CH01COH 


964 


29676 


RTA00002931F.b.03.1.P.Seq


F 


M00042788A:F04 


CH16COP


965 


17789 


RTA00002891F.a.21.1.P.Seq


F 


M00001671A:H10 


CH01COH


966 


45097 


RTA00002928F.g.06.1.P.Seq


F 


M00040247D:D02 


CH13EDT 


967 


18407 


RTA00002909F.b.11.1.P.Seq


F 


M00022546B:E05 


CH03MAH 


968 


22309 


RTA00002900F.n.19.1.P.Seq


F 


M00005392A:G06 


CH02COH 


969 


109382


RTA00002907F.k.13.1.P.Seq


F 


M00022224A:G07 


CH03MAH 


970 


92273 


RTA00002909F.j.17.1.P.Seq


F 


M00022662D:H03 


CH03MAH 


971 


8403 


RTA00002915F.j.22.1.P.Seq


F 


M00032474A:G03 


CH08LNH 


972 


7763 


RTA00002928F.h.10.1.P.Seq


F 


M00040267D:A12 


CH13EDT 


973 


13470 


RTA00002930F.k.09.1.P.Seq


F 


M00056304A:H05 


CH15CON


974 


1484 


RTA00002921F.k.10.1.P.Seq


F 


M00033556D:C10 


CH09LNL 


975 


10345 


RTA00002892F.o.19.2.P.Seq


F 


M00003848C:G09 


CH01COH


976 


17242 


RTA00002931F.a.05.1.P.Seq


F 


M00042433A:E11 


CH16COP 


977 


171180 


RTA00002909F.f.24.1.P.Seq


F 


M00022618B:D09 


CH03MAH 


978 


16790 


RTA00002914F.c.03.1.P.Seq


F 


M00028067A:C11


CH08LNH 


979 


139516 


RTA00002903F.1.02.1.P.Seq 


F 


M00007032C:A12 


CH02COH 


980 


4825 


RTA00002930F.b.15.1.P.Seq


F 


M00042742B:E04 


CH15CON


981 


8830 


RTA00002930F.a.05.1.P.Seq


F 


M00042528C:H01 


CH15CON


982 


12398 


RTA00002935F.o.19.1.P.Seq


F 


M00055391B:C07 


CH17COHLV


983 


17867 


RTA00002900F.c.14.1.P.Seq


F 


M00004850A:B02 


CH02COH 


984 


15796 


RTA00002935F.b.12.1.P.Seq


F 


M00043339C:FU 


CH17COHLV 


985 


185669 


RTA00002935F.f.13.1.P.Seq


F 


M00054686A:A09


CH17COHLV


986 


13638 


RTA00002935F.j.20.1.P.Seq


F 


M00055002B:E08


CH17COHLV


987 


8280 


RTA00002930F.e.12.1.P.Seq


F 


M00055653C:B07 


CH15CON


988 


12632


RTA00002931F.c.03.1.P.Seq


F 


M00042860B:C07


CH16COP


989 


7620 


RTA00002935F.m.18.1.P.Seq


F 


M00055240A:A08 


CH17COHLV 


990 


23922 


RTA00002935F.m.20.1.P.Seq


F 


M00055244B:F07 


CH17COHLV


991 


43864 


RTA00002931F.b.05.1.P.Seq


F 


M00042794A:F01 


CH16COP


992 


34478 


RTA00002929F.2.13.1.P.Seq


F 


M00040367A:C08 


CH14EDT 


993 


6861 


RTA00002933F.c.17.1.P.Seq


F 


M00043221D:C12 


CH19COP


994 


13971 


RTA00002933F.b.01.1.P.Seq


F 


M00043099A:H04 


CH19COP


995 


13971 


RTA00002933F.a.24.1.P.Seq


F 


M00043099A:H04 


CH19COP


996 


13244 


RTA00002927F.e.08.1.P.Seq


F 


M00039537A:F08


CH12EDT 


997 


7455 


RTA00002935F.d.11.1.P.Seq


F 


M00054828B:E05


CH17COHLV 


998 


18915 


RTA00002929F.b.21.1.P.Seq


F 


M00040201A:H01 


CH14EDT 


999 


4023 


RTA00002935F.h.03.1.P.Seq


F 


M00054786C:D08


CH17COHLV


1000 


10785 


RTA00002933F.a.11.1.P.Seq


F 


M00043074C:D07 


CH19COP
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1001 


14851 


RTA00002928F.i.07.1.P.Seq


F 


M00040287C:F10


CH13EDT 


1002 


109382 


RTA00002907F.k.13.2.P.Seq


F 


M00022224A:G07 


CH03MAH 


1003 


23878 


RTA00002933F.b.03.1.P.Seq


F 


M00043101D:G11


CH19COP 


1004 


27516 


RTA00002927F.L.17.1.P.Seq


F 


M00039598A:E04 


CH12EDT 


1005 


9652 


RTA00002931F.c.04.1.P.Seq


F 


M00042863D:F09 


CH16COP 


1006 


24729 


RTA00002931F.a.12.1.P.Seq


F 


M00042462B:C02 


CH16COP 


1007 


186041 


RTA00002912F.b.24.1.P.Seq


F 


M00027244C:B06 


CH04MAL 


1008 


12282 


RTA00002935F.i.18.1.P.Seq


F 


M00054908C:A01


CH17COHLV 


1009 


10704 


RTA00002930F.f.12.1.P.Seq


F 


M00055757A:B01 


CH15CON 


1010 


3397 


RTA00002930F.h.08.1.P.Seq


F 


M000559oB:F09 


CH15CON 


1011


35256 


RTA00002886F.m.16.1.P.Seq


F 


M00001374A:B02 


CH01COH 


1012 


1448 


RTA00002900F.g.05.1.P.Seq


F 


M00005002A:C03 


CH02COH 


1013 


1259 


RTA00002922F.n.08.1.P.Seq


F 


M00039131C:B09


CH09LNL 


1014 


16903 


RTA00002935F.a.01.1.P.Seq


F 


M00042352B:A04 


CH17COHLV 


1015 


7884 


RTA00002922F.il.13.1.P.Seq


F 


M00038390B:F02


CH09LNL 


1016 


5976 


RTA00002930F.j.11.1.P.Seq


F 


M00056220D:G02 


CH15CON 


1017


16832 


RTA00002888F.h.20.1.P.Seq


F 


M00001466B:F03 


CH01COH 


1018 


10490 


RTA00002886F.k.07.1.P.Seq


F 


M00001364C:H10


CH01COH 


1019 


6317 


RTA00002928F.2.10.1.P.Seq


F 


M00040252C:G05 


CH13EDT 


1020 


41215 


RTA00002897F.1.11.1.P.Seq


F 


M00004282A:D01


CH01COH 


1021 


6844 


RTA00002889F.I.21.1.P.Seq


F 


M00001562B:B02 


CH01COH 


1022 


10456 


RTA00002897F.f.20.1.P.Seq


F 


M00004242D:H01 


CH01COH 


1023 


2720 


RTA00002891F.i.07.1.P.Seq 


F 


M00003753A:C11


CH01COH 


1024 


12473 


RTA00002888F.o.07.1.P.Seq


F 


M0000149~C:F10 


CH01COH 


1025 


15840 


RTA00002919F.n.17.1.P.Seq


F 


M0003323CC:G10 


CH08LNH


1026 


6554 


RTA00002895F.b.16.1.P.Seq


F 


M00004062D:A02 


CH01COH 


1027 


7330 


RTA00002918F.n.17.1.P.Seq


F 


M00032987B:F01


CH08LNH


1028 


2206 


RTA00002919F.f.12.1.P.Seq


F 


M00033071C:G05


CH08LNH 


1029 


42705 


RTA00002935F.f.02.1.P.Seq


F 


M00054643D:F07 


CH17COHLV 


1030 


33865 


RTA00002930F.b.07.1.P.Seq


F 


M00042722C:C09 


CH15CON 


1031


5196 


RTA00002925F.o.19.1.P.Seq


F 


M00040034B:G02 


CH09LNL 


1032 


8087 


RTA00002935F.o.10.1.P.Seq


F 


M00055375C:F12 


CH17COHLV 


1033


20072 


RTA00002935F.g.06.1.P.Seq


F 


M00054744C:F12 


CH17COHLV 


1034 


12797 


RTA00002935F.g.24.1.P.Seq


F 


M00054781D:A11


CH17COHLV 


1035 


3207 


RTA00002930F.c.04.1.P.Seq


F 


M00054793B:A06 


CH15CON 


1036 


19600 


RTA00002929F.f.24.1.P.Seq


F 


M0004035iD:G07 


CH14EDT 


1037 


6278 


RTA00002935F.j.18.1.P.Seq


F 


M00055001C:G10 


CH17COHLV 


1038 


19363 


RTA00002927F.e.12.1.P.Seq


F 


M0003956-iD:D04 


CH12EDT 


1039 


15447 


RTA00002929F.2.12.1.P.Seq


F 


M00040366B:H10 


CH14EDT 


1040 


9676 


RTA00002932F.a.14.1.P.Seq


F 


M00042960B:C06 


CH18CON 


1041 


12560 


RTA00002929F.h.06.1.P.Seq


F 


M00040381A:B06


CH14EDT 


1042 


12727 


RTA00002933F.c.15.1.P.Seq


F 


M00043219C:C02


CH19COP 




Z / -f / D 


dt \ nnnn7o i lp ia i d 

K 1 .-\UUUU-i J l4r.L. ID. 1 .r.^eq 


r 


MUU02S0 /0D:C0j 


CH08LNH


1044 


30646 


RTA00002908F.L.11.1.P.Seq


F 


M00022416D:D01 


CH03MAH 


1045 


45585


RTA00002925F.h.20.1.P.Seq


F 


M00039894C:D09


CH09LNL 


1046


25025 


RTA00002925F.e.18.1.P.Seq


F 


M00039860B:E01


CH09LNL 


1047 


15715 


RTA00002919F.p.05.1.P.Seq


F 


M00033274D:F03 


CH08LNH


1048 


33185


RTA00002926F.c.07.2.P.Seq 


F 


M000400~SA:C07 


CH09LNL 


1049 


8384


RTA00002903F.o.13.1.P.Seq


F 


M00007112D:D03


CH02COH 


1050 


8843


RTA00002917F.h.17.1.P.Seq


F 


M00032733B:F12 


CH08LNH
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SEQ 
ID


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID


LIBRARY


1051


21401 


RTA00002930F.e.20.2.P.Seq


F 


M000^^6 / 6 A:G02 


CH15CON


1052 


14434 


RTA00002903F.1.01.1.P.Seq


F 


M00007032A:B05


CH02COH


1053 


40045 


RTA00002891F.k.14.1.P.Seq


F 


[VIUUOOj /d4A:H09 


CH01COH


1054 


21853 


RTA00002896F.g.12.1.P.Seq


F 


M00004 i ^9C : D 1 0 


CH01COH


1055 


23439 


RTA00002935F.p.01.1.P.Seq


F 


M0O0O D402 A : HO 1 


CH17COHLV 


1056 


13060 


RTA00002934F.a.19.1.P.Seq


F 


\.jtr\r\r\ t ^ * t~> r\ o 

M0004j^29A:B08 


/""TTOAAAITT XT 

CH20COHLV 


1057 


23439 


RTA00002935F.o.24.1.P.Seq


F 


XiAAAiTC 1 AO \ T TA 1 

MOOO^ 3402 A : HO 1 


CH17COHLV


1058 


20547 


RTA00002931F.b.12.1.P.Seq


F 



M00042822A:H04 


CH16COP


1059 


4319 


RTA000029 jOF.a.O j. 1 .P.Seq 


F 


* i€ t\f\r\ 1 »^ ^ Oj ^ T~» T T/"\ 1 

M000423 2^ B : HO 1 


/— < T T 1 — A AM 

IHI^IUIN 


1060 


21430 


RTA00002901F.c.18.1.P.Seq


F 


M0000543 2 B : GO j 


CH02COH


1061 


7668 


RTA000029j>^F.g.2 j. 1 .P.Seq 


F 


M00034/S I B:H04 


/ — ' t t 1 1 r\ i_jt \t 
In 17CUHLV 


1062 


16239 


RTA000029joF.k. 19. 1 .P.Seq 


F 


MOOO^^Uo I A: Alto 


Tj 1 \ / 
Lnl /UUrUL V 


1063 


5631 


RTA00002929F.e.16.1.P.Seq


F 


Ik f AAA imn/CD / • 

M000403 _ o B : OUv 


Ani 1 TT rYT 

L.H 14cL> I 


1064 


18362 


RTA00002928r .a.U4. 1 .P.Seq 


r 


\/fnnni07'J0D . u i n 
MUUUjv / jvd .rl 


v^.rl l je,jj l 


1065 


8034 


R T A000029j2r.a.22. l.r.Seq 


r 


x/r\AA onoin. \ i r\ 




1066 


12497 


r» t* \ aaaaoaooc i o i ra o 

RTAU0U0292or .a. iy. 1 .P.Seq 


r 


IV1UUU4U I J - A . riUV 


n T-T 1 's p pit 


1067 


21001 


RTA000O29j2F.d.O7. 1 .P.Seq 


r 


\taaa lOfinAD .UHQ 


T_T 1 OAA\: 


1068 


471 


RTA00002927F.a.11.1.P.Seq


F 


M00O j9 1 c>4 D : H(Jy 


t_r 1 0 tr ht 
CrilzcUl 


1069 


10003 


RTA00002S97F.b. 1^.1 .P.Seq 


F 


M000042 1 3 B : CO^ 


Z^ 1 T r A 1 AATT 


1070 


16074 


RTA0000293^F.f. 18.1 .P.Seq 


F 


M000^4708C:B06 


IH 17LUHLV 


1071 


13698 


RTA00002902F.1.01.1.P.Seq


F 


M0000674j A: HI 1 


CH02COH 


1072 


24819 


RTA00002922F.j.03.1.P.Seq


F


M00Oj9O/SB:BOj> 


/~* T TA A T 4.TT 


1073 


21511 


RTA00002892F.1.01.1.P.Seq


F 


M0000 j S2 1 C : E 1 2 


Aon 1 /^/^\tj 


1074 


12402 


RTA00002929F.d. Id . 1 .P.Seq 


F 


M00040j 1 4B :D0 / 


AU1 1 rr r*\T 


1075 


142755 


RTA0000290j>F.p.06. 1 .P.Seq 


r 


M0OOO/ LoA:AU- 




1076 


3010 


RT A000029 jOF.p. 12.1 .P.Seq 


F 


4 (AAA C C 1 A ( — 1 . ( ' 1 ~) 


L,ril / LUrlL V 


1077 


17173 


RT A000029j5F.k. 09. 1. P.Seq 


r; 

r 


"4 .fAAAC CA 1 ^ T) . T TAP 

MUUlto3lJ4 j b :rlUo 


T_r 1 nnrwx\ \i 


1078 


2969 


RTA000029 j3F.a. 12. 1 .P.Seq 


F 


4 (AAA • A ~" ^ T~\ . \ AA 

M0004_>U / 6U: AU-i 


e^TJ 1 A/ - 'AND 


1079 


19600 


RTA00002929F.g.01.1.P.Seq


F 


4 (AAA < A "* C i . A*7 


Crll4t:iJi 


1080 


8542 


RTA00002927F.h.24.1.P.Seq


F 


4 f A A A ~> A / I """ 4 . 4 A A 

^1000^964 A: AD I 


ATI 1 n PP»X 


1081 


24795 


RTA00002927F.f.10.1.P.Seq


F


M000 j 9 3 9 4C : B 06 


a tj 1 a r^T 


1082 


19695 


RTA00002927F.i.03.1.P.Seq


F 


H fAAAO/-\/~ | *" n \ A A 

M000j9O4 B:A02 


a 0 1 a p"r>nr 


1083 


8542 


RTA0000292 /F. l. 01. 1. P.Seq 


F 


4 ,f AAA ^ A i — \ . \ A A 

M000j964 A:AU2 


CHI -tLiJ 1 


1084 


21409 


RTA00002902F.e.09.1.P.Seq


F 


4 fAAAA/1 Z~ A 1 . A -C 

M0000660 1 D:olto 


punorTMi 


1085 


186318 


RTA000029 1 2F.n.O / . 1 .P.Seq 


r 


A if AAA "! "7 o y- -4 t-v ,(~"C\ \ 

MUUU2 / j>dj«D:IjU4 


run/iM A T 
l nU4ivi/vL 


1086 


7379 


T-| <-p 4 AAAA1 AA 1 T7 _ 1/> 1 T~> O 

RTA00002901r.e. 10. 1 .P.Seq 


r 


TV/fAAAA^ 1 ,C 0 7~> . AA 1 

MUUUU34D^ U .IU 1 




1087 


91285 


RTA00002909F.L.12.1.P.Seq


F 


M0002 2 o 6 _ u : A i U 


a ir ai \ ,r \ tj 


1088 


3285 


RTA0000290jF.m. 1 S. 1 .P.Seq 


r 


4 fAAAAlA/, 1 T~A . "P^ 1 ") 

MOOOO /004D:1J l_i 


InUilUn 


1089 


6284 


RTA00002896F.g.22.1.P.Seq


F 
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M00004 I oOD :Olto 
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1090 


15676 
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F 
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att 1 7PALTI \/ 


1091 


34112 
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F 


4 f A A A A 1 A J "> T^AA 

M000040^ D : r 09 


A TJ A | AATT 


1092 


16407 


RTA00002892F.o.24.2.P.Seq


F


MOOOO i3:A01 


A I f A 1 AATJ 


1093 


919 


RTA00002890F.f.18.1.P.Seq


F 
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ATT A I f ALT 
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r 
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CH01COH 


1095 


31167 


RTA00002900F.f.05.1.P.Seq


F 


M00004S7oB:A06 


CH02COH 


1096 


23873 


RTA00002930F.i.03.1.P.Seq


F 


M00056035D:A08


CH15CON 


1097 


15679 


RTA00002900F.h.12.1.P.Seq


F 


M000050lcC:E04 


CH02COH 


1098


31852


RTA00002911F.o.22.1.P.Seq


F 


M0OO27rOD:CO7 


CH04MAL 


1099 


39030 


RTA00002891F.f.07.1.P.Seq


F 


M00001692C:C04 


CH01COH 


1100 


16407 


RTA00002892F.p.01.2.P.Seq
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ATT 1 CAA \f 


1 O 1 o 
12 1 0 


25 54 J 


Fi T \ AAAAO AACC ~ AC t D C 

R 1 A0UU0290or .0.05 . 1 .P.Seq 


r 


X ,f AAAOO CAO > .ILTAO 

M0002 2509 A . HU2 




12 19 


1794 


T"l 0~ s AAAAono » n^ 1 A C — 

RTA00002924F.g.06.1.P.Seq


r 


X -rAAA^A^" tl r~\ / — ' /""■ A £L 

M000^9560C:G06 


CH09LNL


1220


22038


RTA00002896F.a.20.1.P.Seq


F 


% € AAA A « 1 "IS' /**" I~> 1 

M000041 j6C:B 12 


CH01COH 


1221 


6011


RTA00002924F.1.12.1.P.Seq


F 


M000394/SC:BO_ 


CH09LNL


1222 


41087 


RTA00002901F.o.06.1.P.Seq


F 


M00005 6 / 5 D : D09 


CH02COH 


1223 


18534 


RTA00002908F.o.16.1.P.Seq


F 


"fc fAAA^A^" T T"« t AA 

M00022512B:A09


CH03MAH


1224 


1444 


RTA00002922F.a.12.1.P.Seq


r 


M000j8j>89D.D10 


/^*ttaat x.rr 


1225


1078 


RTA00002S9 jr. p. 24. 1 .P.Seq 


T? 

r 


x /aaa a no o rrn^ 

M0OOOj»9 /2C:r0/ 


/ — 1 r ta r rAtj 


1220 




A TT \ AAAAO OA/^"C l~ AA 1 A C ^ ^, 

R 1 AOU002o9or.n. 09. l.P.Seq 


r 


X ,f AAA A 1 1 O . A "2 1 

M00004 1 6 j B : C U j 


ATJA 1 AATJ 


122 / 


102)042 


O T™ \ AAAAOA 1 O "C ^ I I | n C ^ 

R 1 A00U029 1 /r.0. 1 1. l.P.Seq 


r; 


X ,f AAA""? T7AC / — ' . A A"2 

M0UU3 279 5 L. : AOj 


ATTAQT \nj 

v^riUoLINrl 


122o 


6878 


A O" v r\A A AO A 1 AC 1 11 t AC -» 

R 1 A00002912F.1. 1 1. l.P.Seq 


r 


X. f AAAOO C 1 . CA 

M000275 1 j>U:r0o 


UrlU4iVlAL 


i n a 

1229 


23639 


Tl T" \ AAAAAA 1 O T** ' f\C 1 A O 

RTA00002912F. i. 05. l.P.Seq 


F 


x <nnAoo">o in da t 

M0002 / j 8 I B : B04 


A TUT A . t X A \ T 

CH04MAL. 


IOTA 

1230 


19635 


r> o" \ aaaaaoa/cit >- i <r i r> o » ^. 

RTA00002896F.C. 1 5. 1 .P.Seq 


r 


* f A A A A ,1 1 f 1 T — \ . O A A 

M 00004 1 44 D : B 0- 


atja i (~\ r t 
L.Uri 


123 1 


7217 


tit \ oA/*\no no t~* _i i o o r*» c _ 

RTA00002926F.d. 18. 2. P. Seq 


F 


X (AAA t AAA in A C 

M00040094B:C0S 


T T A A r vrr 


1232 


4930 


T~% *T* ft AAAATA1 Ar 1 /> 1 1 n C 

RTA000029 jOF.g.04. 1 .P.Seq 


F 


X < A A A J" J" n IA Z - " Pv ^ 

M000558 IOC: DO j 


ATT 1 C /~" /A K T 


1233 


16945 


RTA0000292 IF. o. 01. l.P.Seq 


F 


\ f A A A ^ O A A A \ 1~> 1 A 

M00038290A:D 1 2 


CH09LNL


12j4 


24790 


n O^ s AAAAOOAAT ^ O 1 t T> C ^ „ 

RTA00002890F.e.2 1. 1 .P.Seq 


F 


X f A A A A I dAZPi AA/ 

M00001606D:D06 


ATJA1 pAH 


1235 


2272 1 


AO^ \ AAAAOATOt? „ AO 1 A C 

R 1 A00002932r .a. 07. 1 .P.Seq 


T7 

r 


X / AAA/1 O C O \ .DAI 

M000425S6A:B0 1 


An 1 Q f^rWI 

Cri loCUiN 


12jO 


14861 


t-> -t \ AAAAO A A 1 "C ^ 1 0 1 A C -> ^ 

R 1 AU0UU29Ulr.g. lo. l.P.Seq 


c 
r 


X /f AA AA C ;a;d . rr A l 

M 00005505 B :hU 1 




125 / 


O .1 C O 

2432 


DO' \ AAAAO AO 1 C U AO 1 D Can 

K 1 AUUUU292 Ir .0.02. 1 .r.oeq 


rr 

r 


X /f AAA' 1 O 1 AA D • C 1 A 




12jo 


i no /in 1 

19269 


DO" V AAAATOOIE ^ 1 1 1 D C^.^, 

R 1 AUUUU_oo it .p. 1 1. l.P.Seq 


r 


X f AAAA 1 I^AD./^AI 
IVIOOOO 1 4 J»0B . C U 1 


ATJA 1 AATJ 


i m a 
1239 


16029 


AO" \ AAAAOA^ACT 1 1 O 1 D C 

R 1 A00U029jUr.j. 1 /. 1 .P.Seq 


r 


X.fAAAC/lO 1 1 \ . D t~\£. 

M0005 62 44 A : B 06 


Ln 1Dv_Uln 


1240 


3038 


RTA00002922F.m.04.1.P.Seq


F 


X (AAA" 1 A 1 A t 1 — \ T"" A — 

M000j9 12 iD:E0/ 




1241 


2933 


RTA00002922F.g.20.1.P.Seq


F 


l f AAA A AT f T~\ /— A 1 

MOOOj9056B:G01 


CH09LNL


1242 


15956 


n T"* \ AAAAT OA 1 1~ ■ o ^ 1 n C" _ 

RTA00002891F.j.2j. l.P.Seq 


F 


X fAAAA^OZT 1 n 7~> A 

M 0000 j /6 I B:B0_ 


Ann i a att 


1243 


15524 


RTA00002930F.g.09.1.P.Seq


F 


MOOO^^S 13B:D01 


ATT 1 CA*^X' 

CHoCON 


1 ^ AA 
i -144 


2 1 5->u 


K i -AUUUU-V J J r .1.2 1 . 1 .r .oeq 




1V1UUU04 / . L^UV 


ATT i OAAf-TT V 


1245 


17567 


RTA00002918F.p.11.1.P.Seq


F 


M00033006A:F10 


CH08LNH 


1246 


20293 


RTA00002888F.j.20.1.P.Seq


F 


M0OOOU77D:G09 


CH01COH


1247 


9520 


RTA00002927F.h.15.1.P.Seq


F 


M00039642OF0S 


CH12EDT 


1248 


2700 


RTA00002889F.e.21.1.P.Seq


F 


M00001539C:F12 


CH01COH 


1249 


25891


RTA00002909F.p.23.1.P.Seq


F 


M00022740C:H11


CH03MAH 


1250 


4298


RTA00002908F.c.22.1.P.Seq


F 


M00022383C:A12


CH03MAH



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CLON~E ID 


LIBRARY


1 9S 1 


904 1 1 


ot \ nnoo9Q00F e 09 1 p c^n 
ix I .AUUUu-7U7r.c.u-.. i .r.jcq 


r 


A/f0009""> S0HR * POS 




19^9 


1QA 1 1 


pt a nono^QOAF m 91 1 p 


n 
r 


\zT0007^06Q rvf 1 T 


T_TA"^ \,T \ LJ 

UrlUjivlArl 


1 9si 


1 91 1 S 


PTAOO0O90O7F 1 94 n p <v,»n 




\zf0oo9o-t inn - p j i 




1 9-S4 


4Q10 


dt \ hooo9qiof a 04 9 p 


r 


\;fooos s x ! or* • noi 


L-O 1 jCUIN 




1 9DT 9 


PT A OOOO">094F r 16 1 P "nph 




naoooioj. > x r nos 




1 9S6 


i oso i 


PT A 00009Q10F h 9^ 1 p Q^n 


p 


N/f000S60^_LR POO 


v_ rl i jL,L/fN 


1257 


1 1 1 1 4 


DT A0000901SF V 0 1 I P Sen 


p 
i 


M0OOS SO" 1 % A ■ F 1 1 


PFT I 7POf-fT V 
v_ n. I / ^wlLL V 


1258 


64° 6 


DT A00009997F h IS 1 P Sen 


p 


M00O1019^r . C-inQ 


\_- n i. ^. i_.i_7 i 


1259 


990S 


RT A00009904F c 01 1 P Sen 


p 


M00007 1 Q>r^F 1 1 


V • 1 lWi. \ V_7 1 1 


1260 


600 1 

\jyy I 


RT A00009Q0OP j 17 1 p Sen 


F 


MOOOIQOS I R P04 


PH09T N7 

V^, 1 1W 7Li' i_/ 


i ~*6 1 

l-Ul 


1 1 09 9 


RT A 0H009Q06F h OS 1 P Sen 


p 


\/f 0OO^> lQ"ir-RI 1 
IVIUUU— 17 lLu 1 I 


PFT01\/f AH 


1 *L U _ 


9R996 


RT A00009Q07F n 90 1 P Sen 


p 


^00099^6"" R * R06 


PH01IVT AH 


1961 


1 6fKQ 

10UJ7 


PT A 0D009Q1SF i 11 1 P Sen 


p 
* 


\aooos4Q™ sr* • F0 1 


ru l 7POHT V 


1 964 


99^9 


RT A0n009QQ6F \c 94 1 P Q^n 
Ix 1 .nUUw-OOOr.K.-'t.l.r.OCL} 


p 


\A OOOO 1 1 6 x A • A OS 


PRO 1 row 


196S 


40SQ 


RT A nO009Q1SF f 1 Q 1 P Sen 


p 
r 


MOOOS47 1 iR'TilO 


pHi7rnHi v 


1 <!. UU 


9 1 70S 

Z. 1 / 7J 


RT A0D009Q01F h 16 I P Sen 


F 


M0000S44"* A R 1 0 


x^ RU-V- w n 




1 S04Q 


RT A00009Q1SF i 10 ! P Sen 

Ix 1 .nl/UvJU^7 jjr . J . l\J. i . i . JCL| 


F 


A/TOOOS40" "R-FP 


PH17rOHT V 
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RT A Dn009010F r 09 I P Sen 
ix i .-\uuUu«7jur.L.u-. i .r.jCL| 


p 


M00049Q0S A F00 


x^ n l _j v_ w 1 1 


P69 


90401 


RT A00009Q11F i 14 1 P Sen 


F 
i 


M000410^"PT) 1 9 
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909S7 


RT ADn009Q14F n 14 1 P Sen 


F 


\/f 0004140 x C ■ H0S 


rH9oroHT v 


1 97 1 
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id 
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p 
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P 
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p 
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i 7xn^ 


PT A OOOOlQIsF f OS 1 P 

ix. 1 AUUUUiyjjr.I.Uo. 1 .r.jcC| 
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p 
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v_ fl i vjVw wi 
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F 

r 
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ZJ^J 1 
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p 
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PHI 9POP 




9171 1 


RT A 0000'"*010F a LS ~> P Sen 


F 
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ill J v » — ' . ~ 


1 9Q 1 


•4 / O70 : 
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r 


M000970-^ " A F 1 0 


PH04M Al 


1 9Q9 


17S8 1 
Jw3 0 1 
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F 
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PH I SPO\ 

V 11 I O X_ X7 . ~ 
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49 
4Z 


PT A n000^014F -ill IP Sr»n 


p 


\yf 000414"'" A C ! 0 


PR90POH1 V 
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RT A0000^93SF d 14 1 P Sen 


F 


M00054S V3 B01 


CH17COHLV


1295 


10449 


RTA00002935F.o.20.1.P.Seq


F 


M00055395D:DU 


CH17COHLV


1296 


35359 


RTA00002935F.h.11.1.P.Seq


F 


M00054SL"D:A11 


CH17COHLV


1297 


19657 


RTA00002935F.1.20.1.P.Seq


F 


M00055lccC:D10 


CH17COHLV


1298 


12659 


RTA00002930F.1.21.1.P.Seq


F 


M00056133A:EH 


CH15CON


1299 


9081 


RTA00002934F.a.22.1.P.Seq


F 


M0004364CA:B0l 


CH20COHLV 


1300 


17084 


RTA00002935F.il. 14. 1 .P.Seq 


F 


M000425:C3:H04 


CH17COHLV
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1301 


11972 


RTA00002935F.b.10.1.P.Seq


F 


M00043336D:B03 


CH17COHLV 


1302 


11077 


RTA00002935F.c.12.1.P.Seq


F 


M00043402B:G07 


CH17COHLV 


1303 


126414 


RTA00002885F.a.01.1.P.Seq


F 


M00042350A:A05 


CH16COP 


1304 


113291 


RTA00002935F.m.15.1.P.Seq


F 


M00055232A:E08 


CH17COHLV


1305 


13224 


RTA00002935F.f.15.1.P.Seq


F 


M00054693A:E11


CH17COHLV 


1306 


14883 


RTA00002930F.k.14.1.P.Seq


F 


M00056320B:A03 


CH15CON


1307 


13363 


RTA00002935F.a.02.1.P.Seq


F 


M00042352D:B03 


CH17COHLV 


1308 


16869 


RTA00002889F.c.21.1.P.Seq


F 


M00001533C:G11


CH01COH 


1309 


16 


RTA00002935F.a.06.1.P.Seq


F 


M00042449B:F05 


CH17COHLV 


1310 


4359 


RTA00002903F.j.16.1.P.Seq


F 


M00006994C:F06


CH02COH 


1311


20726 


RTA00002908F.a.17.1.P.Seq


F 


M00022363C:D05 


CH03MAH 


1312 


13713 


RTA00002924F.L.09.1.P.Seq


F 


M00039686C:C01


CH09LNL 


1313 


29271 


RTA00002935F.d.20.1.P.Seq


F 


M00054548C:H06 


CH17COHLV 


1314 


6237 


RTA00002935F.L.14.1.P.Seq


F 


M00054686A:F10 


CH17COHLV 


1315 


3472 


RTA00002922F.n.12.1.P.Seq


F 


M00039134D:F08 


CH09LNL 


1316 


186798 


RTA00002911F.f.11.1.P.Seq


F 


M00026914C:H09 


CH04MAL 


1317 


13193 


RTA00002886F.1.16.1.P.Seq


F 


M00001369A:G06 


CH01COH 


1318 


3554 


RTA00002919F.L.14.1.P.Seq


F 


M00033149B:E10 


CH08LNH 


1319 


19991 


RTA00002908F.h.11.1.P.Seq


F 


M00022446C:H06 


CH03MAH 


1320 


173046 


RTA00002901F.o.19.1.P.Seq


F 


M00005703D:G10 


CH02COH 


1321 


21798 


RTA00002932F.b.12.1.P.Seq


F 


M00043016B:F09


CH18CON


1322 


11303 


RTA00002898F.L.11.1.P.Seq


F 


M00004359A:E01


CH01COH 


1323 


4026 


RTA00002915F.m.02.2.P.Seq


F 


M00032494C:H08 


CH08LNH 


1324 


94859 


RTA00002909F.L.23.1.P.Seq


F 


M00022656D:D07 


CH03MAH


1325 


12315 


RTA00002907F.m.01.1.P.Seq


F 


M00022240D:B11


CH03MAH 


1326 


4822 


RTA00002909F.1.16.1.P.Seq


F 


M00022690A:A07 


CH03MAH 


1327 


97129 


RTA00002909F.L.13.1.P.Seq


F 


M00022684A:E06


CH03MAH 


1328 


15996


RTA00002897F.1.09.1.P.Seq


F 


M000042SiA:C04 


CH01COH


1329 


7209 


RTA00002918F.c.01.1.P.Seq


F 


M00032835D:G04


CH08LNH 


1330 


111888


RTA00002902F.h.08.1.P.Seq


F 


M00006678C:C02


CH02COH 


1331 


15642 


RTA00002902F.2.06.1.P.Seq


F 


M00006646A:A07 


CH02COH 


1332 


20016 


RTA00002916F.f.05.1.P.Seq


F 


M0003256~B:G05 


CH08LNH 


1333 


21603 


RTA00002902F.a.05.1.P.Seq


F 


M000057 63 D: AOL 


CH02COH 


1334 


156903


RTA00002907F.i.09.2.P.Seq 


F 


M00022200B:B05 


CH03MAH 


1335 


1425 


RTA00002916F.b.19.1.P.Seq


F 


M00032541C:G03 


CH08LNH 


1336 


186061 


RTA00002911F.e.24.1.P.Seq


F 


M00026900A:H07 


CH04MAL 


1337 


20717 


RTA00002907F.o.19.1.P.Seq


F 


M00022273A:E03 


CH03MAH 


1338 


12586 


RTA00002887F.a.09.1.P.Seq


F 


M00001385A:E07


CH01COH


1339 


19719 


RTA00002914F.h.23.1.P.Seq


F 


M00028212D:C05


CH08LNH 


1340 


474 


RTA00002917F.g.15.1.P.Seq


F 


M00032727A:E04 


CH08LNH 


1341 


11907


RTA00002923F.o.07.1.P.Seq


F 


M00039381C:C07


CH09LNL 


1342 


6806 


RTA00002928F.d.02.1.P.Seq


F 


M00040169A:G06 


CH13EDT 


1343 


13146 


RTA00002892F.t\ 10.2.P.Seq 


F 


M00003814A:G05


CH01COH


1344 


16686 


RTA00002919F.f.14.1.P.Seq


F 


M00033072A:A09 


CH08LNH 


1345 


6823 


RTA00002888F.a.04.1.P.Seq


F 


M00001433B:E02 


CH01COH


1346 


43029 


RTA00002897F.d.03.1.P.Seq


F 


M00004225D:E03 


CH01COH


1347 


14789


RTA00002935F.k.11.1.P.Seq


F 


M00055055C:F01 


CH17COHLV


1348 


186061 


RTA00002911F.f.01.1.P.Seq


F 


M00026900A:H07 


CH04MAL 


1349 


12823 


RTA00002921F.g.24.1.P.Seq


F 


M00033454D:F05 


CH09LNL 


1350 


25844


RTA00002908F.k.23.1.P.Seq


F 


M000224"4B:C0S 


CH03MAH 
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LIBRARY 
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CH09LNL 


1475 


17330 


RTA00002915F.a.03.1.P.Seq


F 


M000^S616CD09 



CH08LNH 


1476 


25620 


RTA00002902F.f.09.1.P.Seq


F 


M00006631C:A04


CH02COH 


1477 


20601 


RTA00002923F.I.20.1.P.Seq


F 


M00039326A:G07 


CH09LNL 


1478 


6205 


RTA0000°9°3F ° n i 1 P Sea 



F 


M00039°5SC-C01 



CH09LNL 


1479 


726 


RTAXXKXP913F b 16 1 P Sea 



F 


M000°773-D-C0" 



CH04MAL 


1480 


104999 


RTAOOOO^OSF cr 17 1 P Sea 



F 


M000°" , 4" ^3 GP 



CH03MAH


1481 


JUJw 1 


RT A0000°9 19F o 17 1 P Sea 



p 


M0OO VP 6-1 R -F06 


CH08LNH


1482 


5878 


RT A0000°9 13F a 16 1 P Sea 



F 


MOOO^ 76S S C -CO 1 



CH04MAL


1483 


5944 


RT A0000^905F m 07 1 P Sea 



F 


M000^ [ 64° B • A0° 



CH03MAH


1484 


5796 


RT AO000 n 9O8F i ^ 1 i P Sea 



F 


M000' 7O 4^'" A-G05 



CH03MAH


1485 


3804 


RTA00002935F.m.24.1.P.Seq


F 


M00055254A:H03


CH17COHLV 


1486 


2728 


RTA00002918F.a.22.1.P.Seq


F


M00032S2SA.-A06 


CH08LNH 


1487 


3804 


RTA00002935F.n.01.1.P.Seq


F 


M00055254A:H03 


CH17COHLV 


1488 


3932 


RTA00002915F.o.19.2.P.Seq


F 


M000325 l'C.ElO 


CH08LNH 


1489 


16691 


RTA0000°S91F o 03 i P Sea 



F 


M0000 wSC A-G01 



CH01COH 


1490 


15430 


RT A0000 o 900F <? in I p Sea 



F 


MOOOOSOO^D-CO 0 



CHO^COH 



1491 


5637 


RT A0000'^9' 7 5F b 18 1 P Sea 


F 


M0OO^98^0B-F06 



CH09LNL 


1492 


16633 


RT A0000' 7 S97F cr 15 l p Sea 



F 


M00004^4c3*H07 



CH01COH 



1493 


21826


RTA0000°898F cr 06 1 P Sea 



F 


M00004 >4- A-G 1 I 



CH01COH 



1494 


22193 


RTA00002919F.i.09.1.P.Seq


F 


M00033l4cD:A03 


CH08LNH 


1495 


10720 


RTA00002898F.c.14.1.P.Seq


F


M0000432CC:E07 


CH01COH 


1496 


22491 


RTA00002925F.m.06.1.P.Seq


F 


M00040003A:G10 


CH09LNL 


1497 


10423 


RTA00002915F.n.13.2.P.Seq


F 


M0OO325O"D:GO8 


CH08LNH 


1498 


4953 


RTA00002916F.h.11.1.P.Seq


F 


M000325ScC:B04 


CH08LNH 


1499 


185567 


RTA00002911F.p.08.1.P.Seq


F 


M0002717S3:All 


CH04MAL


1500 


25605 


RTA00002924F.m.22.1.P.Seq


F 


M000397lC3:AOl ! 


CH09LNL 
SEQ
ID


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1501 


29446 


RTA00002906F.m.24.1.P.Seq


F 


M00022070B:B04 


CH03MAH 


1502 


9668 


RTA00002908F.g.02.1.P.Seq


F 


M00022421A:F12 


CH03MAH 


1503 


29446 


RTA00002906F.n.01.1.P.Seq


F 


M00022070B:B04 


CH03MAH 


1504 


7171 


RTA00002887F.m.22.1.P.Seq


F 


M00001421B:E07 


CH01COH 
Table 3





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE


1 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




<NONE>


2 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




<NONE>


3 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE>




4 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




<NONE>


5 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




6 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE>


7 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




8 


<NONE> 


<NONE> 


<NONE> 


<NONE> 






9 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE>


<NONE>


10 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


11


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




12 


<NONE> 


<NONE> 


<NONE> 


<NONE> 






13 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




14 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


15 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


16 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


17 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


18


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


19 


<NONE> 


<NONE> 


<NONE> 








20 


<NONE> 


<NONE>


<NONE> 


<NONE> 


<NONE> 


<NONE> 


21 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


22 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


23 


<NONE> 


<NONE> 


<NONE> 


54S562 


GENOME POLYPROTEIN
[CONTAINS: RNA
REPLICASE; HELICASE;
COAT PROTEIN] 2.7.7.48) - 
apple stem grooving virus 
(strain P-209) 


9.2 


24 


<NONE> 


<NONE> 


<NONE> 


416959 


EXCISION REPAIR PROTEIN 
ERCC-6 DNA repair helicase 
ERCC6 - human >gi|182181
(L04791) excision repair protein
[Homo sapiens]


8.9 


25 


<NONE> 


<NONE> 


<NONE> 


3327096 


(AB014541) KIAA0641 protein 
[Homo sapiens]


8 .T 


26 


<NONE> 


<NONE> 


<NONE>


861293 


(U28741) F35D2.1 gene 
product [Caenorhabditis
elegans]


7.9 


27 


<NONE> 


<NONE> 


<NONE> 


3297821 


(AL031032) extensin-like
protein 


5.5 


28 


<NONE> 


<NONE> 


<NONE> 


2119692


transforming growth factor-beta
type III receptor - chicken
>gi|511843 (L01121)
transforming growth factor-beta
type III receptor [Gallus gallus]


5.1 


29 


<NONE> 


<NONE>


<NONE> 


213602S 


protein kinase PRK1 - human


5.0 

















30 


<NONE> 


<NONE> 


<NONE> 


2746912 


(AF040659) No definition line
found [Caenorhabditis elegans]


4.6 


31 


<NONE>


<NONE> 


<NONE> 


2358287 


fAFOICM-CW^ ALR fHomo 
sapiens] 


4.5 


32 


<NONE> 


<NONE> 


<NONE> 


3877816


(Z96048) predicted using 
Genefinder; cDNA EST 
EMBL:D65516 comes from this
gene; cDNA EST yk191a5.5
comes from this gene
[Caenorhabditis elegans]


4.4 


33 


<NONE> 


<NONE> 


<NONE> 


4140268 


(Y14953) SRCR domain,
membrane form 2 


4.1 


34 


<NONE> 


<NONE> 


<NONE> 


1708663 


(U51183) transposase [Hydra
vulgaris] 


4.0 


35 


<NONE>


<NONE>




1184100


(U45958) pistil extensin-like
protein [Nicotiana alata]


3.9 


36 


<NONE> 


<NONE> 


<NONE> 


121073 


GLUCOCORTICOID 
RECEPTOR (GR) 


3.9 


37 


<NONE> 


<NONE> 


<NONE> 


1718298 


(U75698) ORF 45; contains an 
extended acidic domain; EBV 
BKRF4 homolog [Kaposi's 
sarcoma-associated herpesvirus]
homolog, conserved in other
gamma-herpesviruses


2.6 


38 








2352538


(AF006564) alcohol
dehydrogenase [Drosophila
persimilis] persimilis]


1.4 


39 


<NONE> 


<NONE> 


<NONE> 


3192897 


(AF066071) SP85; PsB
[Dictyostelium discoideum] 


1.4 


40 


<NONE> 


<NONE> 


<NONE> 


561645 


(L33421) This CDS feature is 
included to show the translation 
of the corresponding V_region. 
Presently translation qualifiers 
on V_region features are illegal 


1.0 


41 


<NONE> 


<NONE>


<NONE> 


3878S57 


(Z8.ii.iif) predicted using 
Genefinder; cDNA EST 
EMBL:D35016 comes from this 
gene; cDNA EST 
EMBL:D32583 comes from this
gene; cDNA EST
EMBL:D35258 comes from this
gene; cDNA EST
EMBL:C11471 comes from this
gene; cDNA EST EMBL:C... 


1.0 


42


<NONE>


<NONE>


<NONE>


1658571


(U75903) UGT1A7 [Rattus
norvegicus]


1.0 


43 


<NONE> 


<NONE> 


<NONE> 


2338034 


(AF005370) putative immediate 
early protein [Alcelaphine
herpesvirus 1] 


0.36 


44 


<NONE> 


<NONE> 


<NONE> 


3043714 


(AB011167) KIAA0595 protein 
[Homo sapiens] 


0.42 


45 


<NONE> 


<NONE> 


<NONE> 


1723710 


HYPOTHETICAL 92.1 KD
PROTEIN IN ASN2-PHB1
INTERGENIC REGION
>gi|2131678|pir||S64439
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]


0.40 


46 


<NONE> 


<NONE> 


<NONE> 


1723710 


HYPOTHETICAL 92.1 KD
PROTEIN IN ASN2-PHB1
INTERGENIC REGION
>gi|2131678|pir||S64439
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]


0.38 


47 


<NONE> 


<NONE> 


<NONE> 


2996117 


(AF046125) immediate early 2 
[Rat cytomegalovirus] 


0.26 


48 


<NONE> 


<NONE> 


<NONE> 


4151809 


(AF102855) synaptic SAPAP- 
interacting protein Synamon 


0.024 


49 


<NONE> 


<NONE> 


<NONE> 


2773341 


(AF040954) putative protein 
phosphatase 1 nuclear targeting 
subunit [Rattus norvegicus] 


0.017 


50 


<NONE> 


<NONE> 


<NONE> 


1653522 


(D90914) hypothetical protein


3e-04 


51 


<NONE> 


<NONE> 


<NONE> 


3219965 


HYPOTHETICAL 100.6 KD 
TRP-ASP REPEATS 
CONTAINING PROTEIN 
C2C6.04C IN CHROMOSOME 
I 


3e-06 


52 


<NONE> 


<NONE> 


<NONE> 


4185567 


(AF115480) cAMP-dependent
Rap1 guanine-nucleotide
exchange factor [Mus musculus] 


7e-07 


53


<NONE>


<NONE>


<NONE>


1176527


HYPOTHETICAL 43.2 KD
PROTEIN C34E10.1 IN
CHROMOSOME III
>gi|500724 (U10402) C34E10.1
gene product [Caenorhabditis
elegans]


3e-20 


54 


XS5444 


G.pallida repetitive 
DNA element 


5.0 


2118936 


beta-globin - chimpanzee 
(fragment) 


8.6 


55 


X7296L 


Synechococcus sp. 
cpeB, cpeA genes and 
ORF3 


5.0 


462569 


MICROTUBULE- 
ASSOCIATED PROTEIN 1A 
microtubule-associated protein 
MAP1A - rat >gi|205538
norvegicus] 


2.2 


56 


U94747 


Human WD repeat 
protein HAN11 
mRNA, complete cds


5.0 


3875538 


(Z67990) similar to cuticle 
collagen 


1.3 


57 


AF032108


Homo sapiens 
integrin alpha-7 
mRNA, complete cds 


5.0 


2147194 


collagen - Paralvinella grasslei 


0.002 


58 


Z50798 


G.gallus mRNA for 
p52 


5.0 


3122885 


ASPARTYL-TRNA 
SYNTHETASE synthetase 
[Bacillus subtilis] 


3e-11


59 


AB002384 


Human mRNA for
KIAA0386 gene,
complete cds 


5.0 


2632098 


(Y15513) Prodos protein
[Drosophila melanogaster]


9e-12 


60 


X14835 


Thermofilum pendens 
DNA for 16S and 
23S ribosomal RNA, 
tRNA-Met, and tRNA 
Gly 


4.9 


<NONE> 


<NONE> 


<NONE> 


61


U87149


Hordeum vulgare 
nucellin gene, 
complete cds 


4.9 


128578 


NONSTRUCTURAL 
PROTEIN NS-S spotted wilt 
virus (strain CPNH1) non- 
structural protein [Tomato 
spotted wilt virus] 


2.8 


62 


DS7541 


Mus musculus gene 
for integrin alpha v 
subunit, promoter 
region 


4.9 


136956 


HYPOTHETICAL PROTEIN 
UL61 cytomegalovirus (strain 
AD169) cytomegalovirus]


0.038 


63 


U72520 


Mus musculus mena 
protein (Mena) 
mRNA, complete cds 


4.9 


3413892 


(AB007934) KIAA0465 protein 
[Homo sapiens]


6e-07 



WO 01/02568 



PCT/US00/18374 





64 


S79797 


enzymatic
glycosylation- 
regulating gene [rats, 
Sprague-Dawley, 
streptozotocin 
diabetic, heart, 
mRNA, 5010 nt] 


4.8 


<NONE> 


<NONE> 


<NONE> 


65 


AB011102 


Homo sapiens mRNA 
for KIAA0530
protein, partial cds 


4.8 


138022 


RECEPTOR RECOGNIZING 
PROTEIN gp38 - phage Ox2 
>gi|15126 (X05675) gene 38 
(AA 1-266); pid:g15126
[Bacteriophage Ox2] 


3.6 


66 


AF100985


Penaeus monodon 
phosphopyruvate 
hydratase mRNA, 
complete cds 


4.8 


500615 


(D16221) endochitinase [Oryza
sativa] 


2.8 


67 


U31756 


Bacillus subtilis 
gamma- 
aminobutyrate 
permease cds 


4.8 


3880699 


(AL021471) similar to
Eukaryotic aspartyl proteases 
[Caenorhabditis elegans] 
Eukaryotic aspartyl proteases 
[Caenorhabditis elegans] 


2.8 


68 


U25111 


Pisum sativum 
chloroplast 
processing enzyme 
mRNA, nuclear gene 
encoding chloroplast 
protein, complete cds. 


4.8 


1800145 


(U83658) FH1/FH2 protein 
homolog [Emericella nidulans] 


1.6 


69 


U00454 


Mus musculus Cdx-2 
homeobox protein 
gene, complete cds. 


4.7 


<NONE> 


<NONE> 


<NONE> 


70 


M84166


Hamster c-Ha-ras 
protein gene, 
complete cds. 


4.7 


1710606 


RENIN-BINDING PROTEIN
(RNBP) protein [Rattus 
norvegicus] 


0.88 


71 


AF087516 


Mus musculus major 
sperm fibrous sheath 
protein Pro- 
mAKAP82 gene, 
alternative splice 
exons 1' and 1" 


4.6 


<NONE> 


<NONE> 


<NONE> 


72 


X74160 


M.esculenta mRNA 
for granule-bound 
starch synthase


4.6 


<NONE> 


<NONE> 


<NONE> 


73 


M97487 


Haloferax volcanii 
superoxide dismutase 
(sod2) gene, complete 
cds. 


4.6 


2623307 


(AC002409) putative ubiquitin 
protease [Arabidopsis thaliana]


3.4 


74


M57889


Drosophila
melanogaster
suppressor of sable
gene, complete cds.


4.5 


<NONE> 


<NONE> 


<NONE> 


75


D49708 


Rattus norvegicus 
mRNA for RNA
binding protein 


4.5 


<NONE> 


<NONE> 


<NONE> 


76


D31853 


Yeast GTS1 gene for
glycin-threonin/serine 
repeat protein, 
complete cds 


4.5 


2447195 


(U42580) NETTF (7x), DETTS 
(4x) [Paramecium bursaria
Chlorella virus 1]


3.3 


77


Z47036 


Human partial cDNA 
sequence, clone 
bs613; 


2.9 


<NONE> 


<NONE> 


<NONE> 


78 


L19660


Rattus norvegicus 
gastric inhibitory 
peptide receptor 
mRNA, complete cds 


2.7 


2358279 


(AF007871) torsinA [Homo 
sapiens] 


2e-07 


79 


X82841 


A.thaliana Aco gene


2.6 


483212 


immediate-early protein IE110 -
human herpesvirus 1 (strain 
HFEM) (fragment) 


8.4 


80 


X61931 


S.purpurascens famA 
and famB genes for 
FAS domain and acyl- 
CoA-dehydrogenases,
respectively 


2.6 


2290534 


(U95031) sublingual gland 
mucin [Homo sapiens] 


0.47 


81


U13680


Human lactate 
dehydrogenase-C 
(LDH-C) mRNA, 
complete cds. 


2.5 


2887449 


(AB007874) KIAA0414 [Homo 
sapiens] 


3.1 


82 


AB007869


Homo sapiens 
KIAA0409 mRNA, 
partial cds 


2.4 


3130157 


(AB008859) pheromone 
receptor [Fugu rubripes] 


J .*T 


83 


X97479 


H.sapiens mas proto- 
oncogene, 5' region


2.1 


<NONE> 


<NONE> 


<NONE> 


84 


X98374 


R.norvegicus mRNA 
for KIS protein 


1.9 


<NONE> 


<NONE> 


<NONE> 


85 


AE000710 


Aquifex aeolicus 
section 42 of 109 of 
the complete genome 


1.9 


<NONE> 


<NONE> 


<NONE> 



86


IZ


Homo sapiens mRNA
for repressor protein,
partial cds


1.9


"^vi^i \jr* 






87 


Y14321


Homo sapiens 
PMP69 gene, exons 
8,9,10 & 11 


1.9 


<NONE> 


<NONE> 


<NONE> 


88 


D90773 


Kohara clone 
#262(30.3-30.5 min.) 


1.9 


1536816 


(D78305) DNA binding protein 
[Chlorella virus] 


7.9 


89 




Archaeoglobus 
fulgidus section 116 
of 172 of the 
complete genome 


1.9




(X79095)
pyruvate,orthophosphate
dikinase [Flaveria trinervia]


7.7


90 


u jy4 /o 


Rattus norvegicus 
p95 Vav (Vav) proto- 
oncogene mRNA. 
complete cds. 




J.1 ^£17R 
*tl JOl f 0 


(AL023496) hypothetical 
protein


1.6


91 


U28838 


Human transcription 
factor TFIIIB 90 kDa 
subunit 


1.9 


2495730 


HYPOTHETICAL PROLINE- 
RICH PROTEIN KIAA0269 
>gi|1665805|gnl|PID|d1014089
(D87459) Similar to Volvox
carteri extensin (S22697) 
[Homo sapiens] 


0.23 


92 


U20106 


Rattus norvegicus 
synaptotagmin VII 
mRNA. complete cds. 


1.9 


478380 


UL47h protein - Marek's disease
virus 


0.23 


93 


AF071010 


Mouse mammary

tumor virus putative 
integrase, env 
polyprotein, and 
superantigen mRNA, 
complete cds 


1.9 


2781386


(AC004010) similar to Leucine- 
rich transmembrane proteins; 
44% similarity to U42767 
(PID:g1736918) [Homo
sapiens] 


4e-33 


94 


AF061881


Mesocricetus auratus 
c-fos proto-oncogene 
protein (c-fos) gene, 
complete cds 


1.8


<NONE> 


<NONE> 


<NONE> 


95 


AE001397 


Plasmodium 
falciparum 
chromosome 2, 
section 34 of 73 of 
the complete 
sequence 


1.8


<NONE> 


<NONE> 


<NONE> 



96


D14701


Horseshoe crab
mRNA for
coagulation factor B,
complete cds


1.8 


<NONE> 


<NONE> 


<NONE> 


97 


M29154 


P. falciparum 
multidrug resistance 
(MDR) gene, 
complete cds. 


1.8 


<NONE> 


<NONE> 


<NONE> 


98 


L16532 


Rattus norvegicus 
(clone pCNPII) 2',3'-
cyclic nucleotide 3'- 
phosphodiesterase 
(CNPII) mRNA, 
complete cds. 


1.8 



<NONE> 


<NONE> 


<NONE> 


99 


AE001434 


Plasmodium 
falciparum 
chromosome 2, 
section 71 of 73 of 
the complete 
sequence 


1.8 


<NONE> 


<NONE> 


<NONE> 


100 


Z46785 


D.melanogaster gene 
for protamine 
(mst35Bb). 


1.8 


<NONE> 


<NONE> 


<NONE> 


101 


X69822 


P.sylvestris mRNA
for glutamine 
synthetase 


1.8 


219896 


(D90452) 1-caldesmon I [Homo 
sapiens]


9.7 


102 


U49055 


Rattus norvegicus 
CTD-binding SR-like
protein rA8 mRNA, 
complete cds 


1.8 


2497252 


INSULIN-LIKE GROWTH
FACTOR BINDING PROTEIN
4 (IGFBP-4) (IBP-4) (IGF-
BINDING PROTEIN 4) factor-
binding protein-4 - sheep
(fragment) factor-binding
protein-4, IGFBP-4 [sheep, 
liver, Peptide, 237 aa] [Ovis 
aries] 


2.5 


103 


L28101 


Homo sapiens 
kallistatin (PI4) gene, 
exons 1-4, complete 
cds 


1.8 


4204267 


(AC005223) 55585 
[Arabidopsis thaliana] 


2.4 


104 


U66987 


Pandorina morum 
internal transcribed
spacer 1, 5.8S 
ribosomal RNA gene, 
and internal 
transcribed spacer 2, 
complete sequence 


1.8 


2635909 


(Z99121) permease [Bacillus 
subtilis] 


1.9 


105


X58033


Human polymorphic
MspI site DNA
(D3S3 locus)


1.8 


2136878 


keratin KAP5.5 - sheep 
(fragment) >gi|313722


0.65 


106 


U15780 


Human p82 (ST5) 
mRNA, alternatively 
spliced, complete cds 


1.8 


3638957 


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens]


0.64 


107 


AF038535 


Homo sapiens 
synaptotagmin VII 
mRNA, partial cds 


1.8 


457927 


(U00690) calcium channel alphai 
1 subunit [Drosophila 
melanogaster] 


0.51 


108 


AF052134 


Homo sapiens clone 
23585 mRNA 
sequence 


1.8 


232263 


HOMEOBOX PROTEIN HOX- 
D1 (HOX-4.9)


0.28 


109 


X75208 


H.sapiens HEK2 
mRNA for protein 
tyrosine kinase 
receptor. 


1.8 


1730198 


GROWTH-ARREST-SPECIFIC
PROTEIN 1 gene product 
[Homo sapiens] 


0.22 


110 


AB013896 


Xenopus laevis 
mRNA for SOX-D, 
complete cds 


1.8 


2494501 


TRANSCRIPTION FACTOR 
FKH-4 factor [Mus musculus] 


0.17 


111 


D16947


Human HepG2 3' 
region cDNA, clone 
hmd6bl0 


1.8 


3413870 


(AB007923) KIAA0454 protein 
[Homo sapiens] 


0.002 


112 


D13547


Mouse DNA, T early 
alpha (TEA) region 


1.8 


3393018 


(AL031174) hypothetical 
protein 


5e-08 


113 


M35498 


Woodchuck c-myc 
protein gene, exon 1. 


1.8 


3183405 


HYPOTHETICAL 11.3 KD
PROTEIN C2C6.07 IN
CHROMOSOME I
>gi|2370504|gnl|PID|e339194
pombe]

>gi|3451305|gnl|PID|e1316730
(AL031324) very hypothetical 
protein [Schizosaccharomyces 
pombe] 


Se-10 


114 


M84166 


Hamster c-Ha-ras 
protein gene, 
complete cds. 


1.8 


3386622 


(AC004665) unknown protein 
[Arabidopsis thaliana] 


2e-10 


115 


U33135 


Mychodea carnosa 
18S ribosomal RNA 
gene, complete 
sequence 


1.8 


3334982 


(AC005306) R27216_1 [Homo
sapiens] 


3e-22 


116 


U84003 


Homo sapiens 
putative tumor 
suppressor (BIN1) 
gene, exons 7-12 


1.7 


<NONE> 


<NONE> 


<NONE> 


117 


AE001121 


Borrelia burgdorferi 
(section 7 of 70) of 
the complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


118


AE001114


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.7


<NONE> 


<NONE> 


<NONE>


119

U82064 


Angiostrongylus
cantonensis adult-
specific muscle
protein-1 gene, partial
cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


120 


AF041836 


Buchnera aphidicola
plasmid pLeu-Sg, 
complete plasmid 
sequence 


1.7 


<NONE> 


<NONE> 


<NONE> 


121 


M87479 


Lymnaea stagnalis
FMRFamide gene, 
mature peptides. 


1.7


<NONE> 


<NONE> 


<NONE>


122

M55163 


Xenopus laevis
fibroblast growth 
factor receptor 
mRNA, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


123


S57565 


histamine H2- 
receptor [rats,
Genomic, 1928 nt] 


1.7 


<NONE> 


<NONE> 


<NONE> 


124 


M27256 


Simian 

immunodeficiency 
virus (SIV) pol 
region. 


1.7 


<NONE> 


<NONE> 


<NONE> 


125 


U31516 


Human chromosome 
8 anonymous clone 
pBS8-165 


1.7 


<NONE> 


<NONE> 


<NONE> 


126 


X12671


Human gene for
heterogeneous
nuclear
ribonucleoprotein
(hnRNP) core protein
A1


1.7 


<NONE> 


<NONE> 


<NONE> 


127 


AF009054 


Paeonia suffruticosa 
ssp. spontanea 
alcohol
dehydrogenase 1B
(Adh1B) gene, partial
cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


128 


AF046917 


Mus musculus 
transketolase gene, 
exon 6 and partial cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


129 


D89053 


Homo sapiens mRNA 
for Acyl-CoA 
synthetase 3, 
complete cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


130 


U57968 


Staphylothermus 
marinus surface layer-
associated STABLE 
protease gene, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


131 


L39072 


Bovine herpesvirus 1 
(clone p95) UL24 
homologue gene, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


132 


X04980 


Drosophila simulans 
retrotransposon 297 
5'-LTR and flanks
(pWK1020) 


1.7 


<NONE> 


<NONE> 


<NONE> 


133 


AE001114


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


134 


X04434 


Human mRNA for 
insulin-like growth
factor I receptor 


1.7 


<NONE> 


<NONE> 


<NONE> 


135 


U07890 


Mus musculus 
C57BL/6J epidermal 
surface antigen 
(mesa) mRNA, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


136 


D26163 


Human tyrosinase 
gene, 5-flanking 
region cell-specific 
transcription) 


1.7 


<NONE> 


<NONE> 


<NONE> 


137 


AF093818


Panorpa nipponensis 
NADH 

dehydrogenase 
subunit 5 gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


138


D50560


Xenopus laevis
mRNA for
cytochrome P-450,
complete cds


1.7 


<NONE> 


<NONE> 


<NONE> 


139 


AF083488 


Mus musculus 
phospholipase D1
(PLD1) gene, exons
18 and 19, complete 
sequence 


1.7 


<NONE> 


<NONE> 


<NONE> 


140 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1.7


<NONE>


<NONE> 


<NONE> 


141 


M73749 


Streptococcus 
salivarius 

thermophilus beta-D- 
galactose (lacZ) gene,
complete cds. > ::
gb|M63636|STRLACZZ
Streptococcus
thermophilus beta-D- 
galactosidase (lacZ) 
gene, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


142 


AE001114 


Archaeoglobus
fulgidus section 165 
of 172 of the 
complete genome 


1.7 


2183023 


(U84971) unknown [Homo 
sapiens] 


9.2 


143 


L01983 


Human type IV
sodium channel alpha 
polypeptide 


1.7 


130504 


GENOME POLYPROTEIN
[CONTAINS: N-TERMINAL
PROTEIN (P1); HELPER
COMPONENT PROTEINASE
INCLUSION PROTEIN (CI); 6
KD PROTEIN 2 (6K2);
GENOME-LINKED PROTEIN 
(VPG); NUCLEAR ... virus 
(strain D) 


9.2 


144 


L19731 


Plecotus rafinesquii 
mitochondrial 
cytochrome b gene. 5' 
end. 


1.7 


3327096 


(AB014541) KIAA0641 protein 
[Homo sapiens] 


9.1 


145 


AE001114 


Archaeoglobus
fulgidus section 165 
of 172 of the 
complete genome 


1.7 


2183023 


(U84971) unknown [Homo 
sapiens] 


8.S 



146 


L27218 


Bos taurus serum
amine oxidase 
mRNA, complete cds. 
> oxidase=amiloride- 
binding protein 
homolog [cattle, liver, 
mRNA, 2664 nt]


1.7 


1174459 


SIGNAL TRANSDUCER AND 
ACTIVATOR OF 
TRANSCRIPTION 6 (IL-4 
STAT) >gi|559855 (U16031) IL- 
4 Stat [Homo sapiens] 


7.1 


147 


Z49868 


Caenorhabditis 
elegans cosmid 
W07E11, complete
sequence 
[Caenorhabditis 
elegans] 


1.7 


4204263 


(AC005223) 40409 
[Arabidopsis thaliana] 


6.7 


148 


AL022271 


Caenorhabditis 
elegans cosmid 
F32F2, complete 
sequence 
[Caenorhabditis 
elegans]


1.7 


2497969 


PERIPLASMIC NITRATE 
REDUCTASE PRECURSOR 
>gi|1086107|pir||S50163 nitrate 
reductase large chain precursor, 
periplasmic - Thiosphaera 
pantotropha >gi|600093 
(Z36773) periplasmic nitrate
reductase large subunit 
[Paracoccus denitrificans] 


6.7 


149 


U43844 


Mus musculus cyclin 
D3 gene, complete 
cds 


1.7 


3861490 


(AF062037) capsid protein 
precursor [Thosea asigna virus]


5.1 


150 


Z25464 


S.cerevisiae UNFI, 
LTV1,MRP8,CYB3 
and TGL 1 genes, 
complete CDS's 


1.7 


1255404 


(U53151) weak similarity to 
cytochrome b [Caenorhabditis 
elegans] 


4.1 


151 


U77846 


Human elastin gene, 
partial cds and partial 
3'UTR 


1.7 


3355682 


(AL031124) putative secreted
lyase 


4.0 


152 


X62880 


S.scrofa mRNA for 
calcium release 
channel (CRC) 


1.7 


3327080 


(AB014533) KIAA0633 protein
[Homo sapiens]


4.0 


153 


Y00067 


Human gene for 
neurofilament subunit 
M (NF-M) 


1.7 


479829 


heterogeneous ribonuclear 
particle protein homolog -
Caenorhabditis elegans 
similarity to RNA recognition 
motifs [Caenorhabditis elegans] 


3.9 


154 


X68393 


D.melanogaster gene 
for Beta-tubulin, 
exons 1 and 2 


1.7 


2342682 


(AC000106) Contains similarity 
to Rattus AMP-activated protein 
kinase (gb|X95577). 
[Arabidopsis thaliana] 


3.8 


155 


AB012284


Shuttle vector 
pAUR123 gene for 
Aur1-C, complete cds


1.7 


417704 


POL POLYPROTEIN 
(ORF1A/1B) (CONTAINS: 
RNA-DIRECTED RNA
POLYMERASE; HELICASE;
PROTEASE]


3.8 


156 


M96633 


Rattus norvegicus 
mitochondrial 
intermediate 
peptidase (MIP) 
mRNA, complete cds. 


1.7 


2314209 


(AE000613) H. pylori predicted 
coding region HP1054


3.1 


157 


U49055 


Rattus norvegicus 
CTD-binding SR-like 
protein rA8 mRNA, 
complete cds 


1.7 


2497252 


INSULIN-LIKE GROWTH
FACTOR BINDING PROTEIN
4 (IGFBP-4) (IBP-4) (IGF-
BINDING PROTEIN 4) factor-
binding protein-4 - sheep 
(fragment) factor-binding 
protein-4, IGFBP-4 [sheep, 
liver, Peptide, 237 aa] [Ovis 
aries] 


3.0 


158 


Y15907


Mus musculus mRNA 
for myc-intron- 
binding protein-1


1.7 


912776 


iduronate-2-sulfatase, IDS {EC 
3.1.6.13} Peptide Mutant, 550 
aa]


3.0 


159 


U67600 


Methanococcus 
jannaschii section 142 
of 150 of the 
complete genome 


1.7 


2982355 


(AF052252) fork head domain
protein FKD9 [Danio rerio] 


3.0


160 


AF013759 


Homo sapiens 
calumenin (Calu)
mRNA, complete cds 


1.7 


2982355 


(AF052252) fork head domain 
protein FKD9 [Danio rerio] 


2.9 


161


AF062915


Arabidopsis thaliana
putative transcription
factor (MYB90)
mRNA, complete cds


1.7


3878065


Human mRNA product
KIAA0077 (TR:Q14997);
cDNA EST yk243h8.5 comes 
from this gene; cDNA EST 
yk243h8.3 comes from this 
gene; cDNA EST yk359h4.5
comes from this gene
[Caenorhabditis elegans]
>gi|3880318|gnl|PID|e1349839
(Z81133) Similarity to Human
mRNA product KIAA0077
(TR:Q14997); cDNA EST
yk243h8.5 comes from this 
gene; cDNA EST yk243h8.3 
comes from this gene; cDNA 
EST yk359h4.5 comes from this 
gene 


2.3 


162 


X87526 


H.sapiens genomic 
DNA (chromosome 
3; clone NL3003R) 


1.7 


3638957 


(AC004877) sco-spondin-mucin- 
like; similar to P98167 uncertain 
[Homo sapiens] 


2.3


163 


AC005573 


Homo sapiens 
chromosome 5, PAC 
clone 202el3 


1.7 


2465540 


(AF005632) phosphodiesterase 
I/nucleotide pyrophosphatase 
beta [Homo sapiens] 


1.8 


164 


D83402 


Homo sapiens gene 
for prostacyclin 
synthase, exon 10 and 
complete cds 


1.7 


627608 


steroid hormone receptor TR3 - 
human sapiens] 




165 


AF053700 


Homo sapiens deltex 
(Dx) mRNA, 
complete cds 


1.7 


2662089 


(AB007864) KIAA0404 [Homo 
sapiens]


1.7 


166 


AF043225 


Mus musculus 6- 
pyruvoyl- 
tetrahydropterin 
synthase (Pts) 
mRNA, complete cds 


1.7 


2352538 


(AF006564) alcohol 
dehydrogenase [Drosophila 
persimilis] persimilis] 


1.4 


167 


U52917 


Thermus aquaticus
thermophilus NADH
dehydrogenase I
subunits NQO7,
NQO6, NQO5,
NQO4, NQO2,
NQO1, NQO3,
NQO8, NQO9,
NQO10, NQO11,
NQO12, NQO13, and
NQO14, complete
cds. 


1.7 


2564334 


(AB006631) The human 
homolog of mouse Cux-2 
[Homo sapiens] 


1.0 


168 


X72222 


M.musculus gene for 
serotonin 2 receptor 


1.7


3875796 


Similarity to Yeast
hypothetical YIK9 protein 
(SW:YIK9_YEAST); cDNA 
EST EMBL:T01252 comes 
from this gene; cDNA EST 
EMBL:D33205 comes from this 
gene; cDNA EST 
EMBL:D33955 comes from this 
gene; cDNA EST 
EMBL:D35484 co... 


1.0 


169 


U23186 


Crotalus scutulatus
PLA2-like
pseudogene


1.7 


853971 


(X83413) DR5 [Human
herpesvirus 6] >gi|853972
(X83413) DR5 [Human
herpesvirus 6]


0.99 


170 


M83118 


Mus musculus factor 
VIII-associated
protein (f8a) mRNA, 
complete cds. 


1.7 


3201617 


(AC004669) hypothetical 
protein [Arabidopsis thaliana] 


0.80 


171 


M38347 


E.coli ATP- 
dependent proteinase 
(lon) gene, complete
cds. 


1.7 


4140322 


(AL031282) dJ283E3.12 (Cell
Division Cycle 2-Like 2 
(PITSLRE, p58/GTA 
Galactosyl transferase 
Associated Protein Kinase)) 
(isoform beta 2-2) [Homo 
sapiens] 


0.78 


172 


U28838


Human transcription 
factor TFIIIB 90 kDa 
subunit 


1.7 


2495730 


HYPOTHETICAL PROLINE-
RICH PROTEIN KIAA0269
>gi|1665805|gnl|PID|d1014089
(D87459) Similar to Volvox
carteri extensin (S22697)
[Homo sapiens]


0.62


173 


U72487 


Rattus norvegicus 
calcium-independent 
alpha-latrotoxin 
receptor mRNA, 
complete cds 


1.7 


544411


GLYCOPROTEIN GP100 
PRECURSOR (P29F8) 
discoideum] 


0.35


174 


AE000718 


Aquifex aeolicus 
section 50 of 109 of 
the complete genome 


1.7 


2497569


FIBROBLAST GROWTH 
FACTOR RECEPTOR 3 
PRECURSOR (FGFR-3) 
(HEPARIN-BINDING
GROWTH FACTOR 
RECEPTOR) 
>gi|21I785i|pir||I55363 
fibroblast growth factor receptor 
3 - mouse >eil 199 145 flVfff I 
fibroblast growth factor receptor 
3 [Mus musculus] 


0.34


175 


AF016897


Oryza sativa GDP 
dissociation inhibitor 
protein OsGDI2 
(OsGDI2) mRNA, 
complete cds 


1.7 


125362 


MACROPHAGE COLONY
STIMULATING FACTOR I
RECEPTOR PRECURSOR
(CSF-1-R) (FMS PROTO-
ONCOGENE) (C-FMS) factor 1
receptor - cat >gi|163855
(J03149) M-CSF receptor [Felis
domesticus]


0.34


176 


U95102 


Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA, complete cds 


1.7


85058 


muscarinic acetylcholine
receptor - fruit fly acetylcholine
receptor [Drosophila
melanogaster]


0.20


177 


AF077352


Chlamydomonas
reinhardtii myosin
heavy chain


1.7


728901


ACROSOMAL PROTEIN SP-
10 PRECURSOR SP-10 -
western baboon
>gi|298488|bbs|127113
(S5645S) SP-10=intraacrosomal
protein [Papio papio=baboons,
Peptide, 285 aa] [Papio
hamadryas]


0.20


178 


Z92788


Caenorhabditis
elegans cosmid
F53B8, complete
sequence
[Caenorhabditis
elegans]


1.7


746516


(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]


0.068
PCT/US00/18374




179 


AF002217


Ralstonia eutropha
megaplasmid pHG1
nitric oxide reductase 
(norB) gene, 
complete cds 


1.7 


1143538


(X87883) mitochondrial capsule
selenoprotein [Rattus 
norvegicus] >gi|1354135 
^ u^o / \J4.) rniLOcnonoia 
associated cysteine-rich protein 
SMCP


0.039


180 


D30749 


Rat mRNA for 
protein tyrosine 
phosphatase 


1.7 


1228035 


\L'od / foj i ne JviAAUiyi gene 
is expressed ubiquitously.; The 
KIAA0191 protein retains the 
C2H2 zinc-finger at its N- 
terminal region. [Homo sapiens]


0.008


181 


M15202 


Rat fast skeletal TnT 
gene encoding 
troponin T isoforms, 
complete cds. 


1.7 


731172 


SKIN SECRETORY PROTEIN 
XP2 PRECURSOR 


4e-04


182 


L07592 


Human peroxisome 
proliferator activated 
receptor mRNA, 
complete cds. 


1.7 


4033414 


PUTATIVE IMPORTIN BETA- 


2e-06


183 


U64031 


Dendrobium 
crumenatum ACC 
synthase gene, 
complete cds 


1.7 


3122885 


ASPARTYL-TRNA 
SYNTHETASE synthetase 
[Bacillus subtilis] 


2e-11


184 


AF034970


Homo sapiens
docking protein
(DOK-2) mRNA,
complete cds


1.7


2289097


(U78737)
alpha(1,3)fucosyltransferase
[Cricetulus griseus]


8e-12


185 


Z12839


encoding calmodulin.
>

gb|L18912|LILCALM
ODU Lilium
longiflorum
calmodulin mRNA,
complete cds.


1.7


2511747


(AF023270) probable
transcriptional regulator dre4


4e-12
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



DESCRIPTION 



186 



X53459 



Equine arteritis virus 
(EAV) RNA genome 
> :: 

emb|A45589|A45589 
Sequence 1 from
Patent WO9519438 >

emb|A58849|A58849 
Sequence 1 from 
Patent WO9700963 > 

gb|AR013959|AR013 
959 Sequence 1 from 
patent US 5773235 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE



ACCESSION 



DESCRIPTION 



1.7 



3979817 



l (g70C33) Wuikoumljiiij iu 
Human tyrosine-protein kinase 
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...
Human tyrosine-protein kinase
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...



P VALUE



1e-14



187 



K02668 



188 I AB008375 



189 



L36603 



E. coli ddl gene 
encoding D-alanine:D 
alanine ligase and 
ftsQ and ftsA genes, 
complete cds, and 
ftsZ gene, 5' end. 



Homo sapiens mRNA 
for osteoblast specific 
cysteine-rich protein 
complete cds 



Pseudomonas cepacia 
(clone Psudom70-1) 
heat shock protein 70 
(hsp70) gene, 
complete cds 



1.7 



3879121 



1.7 



2496945 



1.7 



2661842 



(Z70310) predicted using
Genefinder; Similarity to Mouse
ankyrin (PIR Acc. No. S37771);
cDNA EST EMBL:T01923
comes from this gene; cDNA
EST EMBL:D32335 comes
from this gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES... Genefinder;
Similarity to Mouse ankyrin
(PIR Acc. No. S37771); cDNA
EST EMBL:T01923 comes
from this gene; cDNA EST
EMBL:D32335 comes from this
gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES...



2e-19



HYPOTHETICAL 55.9 KD 
PROTEIN EEED8.6 IN 
CHROMOSOME II >gi|733603
(U23484) No definition line 
found [Caenorhabditis elegans] 



1e-19



(Y15732) DNA polymerase beta
[Xenopus laevis]



6e-20 






WO 01/02568 




190 


Z49760


P.blakesleeanus 
mRNA GTP 
cyclohydrolase I


1.7 


1731181


HY^uiHJbriCAL /t>.5 KD 

CHROMOSOME II 
>gi|3874230|gnl|PID|el351618 
protein (Swiss Prot accession 
number P38376); cDNA EST 
yk220el0.5 comes from this 
gene [Caenorhabditis elegans] 


3e-21


191 


U52428 


Human fatty acid 
synthase gene, partial 
cds 


1.7 


4226073


(AF125443) contains similarity
to S. pombe phosphatidyl
synthase (GB:Z28295)
[Caenorhabditis elegans]


6e-25


192 


U12767


Human mitogen 
induced nuclear 
orphan receptor 


1.6 


<NONE>


<NONE>


<NONE>


193 


Z63478 


H.sapiens CpG DNA, 
clone 85a12, forward
read cpg85a12.ft1a.


1.6


<NONE>


<NONE>


<NONE>


194 


AF084375 


Homo sapiens 
inversin protein, 
exons 8 and 9 


1.6


<NONE>


<NONE>


<NONE>


195


AE0011I4 


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.6


<NONE>


<NONE>


<NONE>


196


AF084375


Homo sapiens
inversin protein,
exons 8 and 9


1.6


<NONE>


<NONE>


<NONE>


197


U24217


Kluyveromyces lactis
RNA polymerase II
largest subunit gene,
partial cds


1.6


<NONE>


<NONE>


<NONE>


198


AE000580


Helicobacter pylori
26695 section 58 of
134 of the complete
genome


1.6


<NONE>


<NONE>


<NONE>

199 | X62083


H.sapiens mRNA for 
Drosophila female 
sterile homeotic 
(FSH) homologue > 
gb|M80613|HUMFS 
HG Human homolog 
of Drosophila female 
sterile homeotic 

mRNA, complete cds 
Plasmodium 


1.6


<NONE>


<NONE>


<NONE>


200 | M28064


brasilianum DNA 
homologous to the 
histidine-rich knob 
protein region of 
Plasmodium 
falciparum. 


1.6 


457495 


(M26647) ORF X
[Saccharomyces cerevisiae]


8.4


201 | U03114


Streptomyces albus
lipase precursor (lip)
gene, complete cds,
and unidentified 5'
ORF and 3' ORF,
partial cds.


1.6


3638957


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens]


7.8


202 1 U8849? 


Strix varia oocyte 
maturation factor 
Mos (c-mos) proto- 
oncogene, partial cds 


1.6 


137618 


VITAMIN D3 RECEPTOR
(VDR) receptor [Rattus
norvegicus]


6.4


203 | M68519


Human pulmonary
surfactant-associated
protein SP-A
(SFTP1) gene,
complete cds.


1.6


3875423


(Z38112) E03A3.6
[Caenorhabditis elegans]


4.9


204 | AF044575


Homo sapiens
transcription factor
POU4F3


1.6


2133625


GABA transport protein -
tobacco hornworm


4.7


205 | L48476


Homo sapiens
subclone 3_e10 from
(P1 H21) DNA
sequence.


1.6


3687297


(AJ005588) 5-epi-aristolochene
synthase


4.6


206 | M18630


Rat CNS 2',3'-cyclic
nucleotide 3'-
phosphodiesterase


1.6


3880315


(Z81133) Similarity to Human
mRNA product KIAA0077
(TR:Q14997) [Caenorhabditis
elegans]


3.7
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



207 | AF027174



208 | U53448 



209 | AF084367 



210 



D55635 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 



Babesia microti heat 
shock protein 70 
(hsp70) gene, 
complete cds



ACCESSION 



DESCRIPTION 



P VALUE



1.6 



267068 



1.6 



1255429 



TUMOR-ASSOCIATED 
ANTIGEN L6 



Homo sapiens 
inversin protein
mRNA, complete cds



Yeast dis1+ gene for
p93dis1, complete
cds 



211 | AF035756 



212 | X73479



213 | X98330



Streptomyces sp. 2- 
dehydro-3- 
deoxyphosphohepton 
ate aldolase gene, 
partial cds 



O.cuniculus rPTPA
mRNA 



1.6 



1730076 



1.6 



3128353 



1.6 



853971 



H.sapiens mRNA for 
ryanodine receptor 



214 | X64194



Z92788



P.anserina FMR1
gene exons 1 and 2
Caenorhabditis

elegans cosmid 

F53B8, complete 

sequence 

[Caenorhabditis 

elegans] 



216 



Methanobacterium 
thermoautotrophicum 
from bases 1098908 
to 1112186 (section 
94 of 148) of the 
AE000888 | complete genome



1.6 



3413810 



(U53155) strong similarity to 
the carboxyl two-thirds of valyl- 
tRNA synthetases 
[Caenorhabditis elegans]



3.6 



PROBABLE 
SERINE/THREONINE - 
PROTEIN KINASE CY49.28 
>gi|1370255|gnl|PID|e247094 
(Z73966) pknJ 



(AF010496) maltose transport
inner membrane protein



(X83413) DR5 [Human
herpesvirus 6] >gi|853972
(X83413) DR5 [Human
herpesvirus 6] 



(Y17034) Bassoon [Mus 
musculus] 



1.6 



2072986 



1.6 



128014 



1.6 



(U95142) putative G-protein-
coupled receptor G-protein-
coupled receptor [Arabidopsis

thaliana] 



746516 



1.6 



462415 



2.2 



1.2 



1.2 



0.97 



0.94 



NECDIN >gi|9i 129|pir||JN0148
necdin, brain - mouse
>gi|200020 (M80840) necdin
[Mus musculus] | 0.42



(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans] | 0.19

INTERFERON-ALPHA/BETA
RECEPTOR ALPHA CHAIN
PRECURSOR (IFN-ALPHA-
REC) >gi|346520|pir||S27387
interferon alpha receptor type 1
bovine >gi|432 | 0.001
Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



217 



218 



DESCRIPTION 



P VALUE 



ACCESSION 



AB008375 



M25312 



Homo sapiens mRNA 
for osteoblast specific 
cysteine-rich protein, 
complete cds 



Orang-utan involucrin 
gene, complete cds 



1.6 



2496945 



1.6 



3875131 



DESCRIPTION 



HYPOTHETICAL 55.9 KD 

PROTEIN EEED8.6 IN 
CHROMOSOME II >gi|733603
(U23484) No definition line 
found [Caenorhabditis elegans] 



(Z70750) similar to vanadate 
resistance protein 
transmembranous domains
[Caenorhabditis elegans]



P VALUE



1e-18



219 



AB012882 



Cyprinus carpio 
mRNA for MyoD, 
complete cds 



1.5 



220 



U29487 



Caenorhabditis 
elegans cosmid 
C09C7 



<NONE> 



<NONE> 



1.5 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



221 



X74760 



M.musculus mRNA 
for Notch 3 



1.5 



1364094 



integral membrane protein - 
Streptomyces pristinaespiralis
>gi|872306 (X84072) integral
membrane protein 



222 



U72396 



Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 



1.5 



121855 



EXOGLUCANASE II
PRECURSOR cellulose 1,4-beta
cellobiosidase (EC 3.2.1.91) II
precursor - fungus (Trichoderma
reesei) 1,4-beta-cellobiosidase
(EC 3.2.1.91) II - fungus
cellobiohydrolase II
[Trichoderma reesei]



4.3 



4.3 



223 



U42391 



Human myosin-IXb
mRNA, complete cds 



1.5 



3688428 



(AJ011534) sucrose synthase 



224 



M92296 



Pongo pygmaeus 
gamma- 1 and gamma 
2 globin genes, 
complete cds. 



1.5 



186413 



(M13144) inhibin A [Homo
sapiens] 



0.22 



225 



C.japonica mRNA for 
X94144 ONR-71 protein 



1.5 



2745737 



(AF029791) UDP- 
Gal:betaGlcNAc beta 1,3-
galactosyltranferase-II [Mus
musculus]
Nearest Neighbor (BlastN vs. Genbank)
SEQ 

DESCRIPTION | P VALUE 



ID | ACCESSION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



226 | AB014557



227 | AF000948



228 | AF057287



229 | U38951

230 | AF027148



Homo sapiens mRNA
for KIAA0657
protein, partial cds



Borrelia burgdorferi
oligopeptide 
permease homolog 
OppAIV (oppAIV) 
gene, complete cds 



1.5 



1212992 



1.3 



Mus musculus 
RAB/Rip protein 
mRNA, partial cds



<NONE> 



DESCRIPTION 



(X90568) Protein sequence and
annotation available soon via
Swiss-Prot; available at present
via e-mail from
LABEIT@EMBL-
Heidelberg.DE [Homo sapiens]



<NONE> 



Drosophila 
melanogaster 
vacuolar ATPase 
subunit E 
Homo sapiens 
myogenic 

determining factor 3 



1.3 



2498005 



1.1 



1.1 



<NONE> 



3172134 



MYC PROTO-ONCOGENE 
PROTEIN (C-MYC) proto- 
oncogene [Sus scrofa]



<NONE> 

(U90209) RNA polymerase II 
largest subunit [Bonnemaisonia 
hamifera] 



P VALUE



4e-13 



<NONE> 



2.6 



231 | AF079310



Mus musculus histone
deacetylase 3 
(Hdac3) gene, exons 
4 through 15 and 
complete cds 



232 | X52134 



233 | D89016



234 | X76392 



P.radiata lac gene for 
laccase 



1.0 



1657601 



Human mRNA for
Neuroblastoma, 
complete cds 



0.95 



996020 



235 | AF100694



C.familiaris VIP36
(vesicular integral-
membrane protein of
36 kDa) mRNA



0.93 



<NONE> 



0.93 



Mus musculus 
Pontin52 mRNA,
complete cds 



4176446 



0.90 



<NONE> 



(U66220) unknown 
[Nannocystis exedens] 



(X91638) BRM protein [Gallus 
gallus] 



<NONE> 



(AL022238) dJ1042K10.2.1 
(novel protein with probable 
rabGAP domains and Src 
homology domain 3) 



0.25 



0.31 



<NONE> 



<NONE> 



7e-81



<NONE>

236 


AE000991 


Archaeoglobus 
fulgidus section 116
of 172 of the 
complete genome 


0.90 


1176579


EGT2 PROTEIN PRECURSOR

>gi|1362345|pir||S55862 
probable membrane protein 
YNL327w - yeast 
(Saccharomyces cerevisiae)
cerevisiae]

>gi|1302445|gnl|PID|e239572
(Z71603) ORF YNL327w
[Saccharomyces cerevisiae]


6.9


237


Z35922 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR053c 


0.86 


<NONE>


<NONE>


<NONE>


238


U47331 


Rattus norvegicus 
metabotropic 
glutamate receptor 4b 
mRNA, complete cds. 


0.82 


1550703


(Z80225) hypothetical protein
Rv2662


4.1


239


X72810 


H.sapiens Ig germline 
kappa-chain gene 
variable region (L3) 


0.69 


3023063 


(AF052587) F14 [Xylella 
fastidiosa] 


6.7


240


Z11700


Escherichia coli
genes faeG, faeH,
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAE
HIJ E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins


0.69


2347188


(AC002338) laccase isolog
[Arabidopsis thaliana] thaliana]


3.9


241


U71597


douglassii NADH
dehydrogenase
subunit 4 (ND4)
gene, mitochondrial
gene encoding
mitochondrial
protein, partial cds


0.65


<NONE>


<NONE>


<NONE>
SEQ

ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



DESCRIPTION 



242 



243 



Z77798 



D25542 



244 



M80234 



Ammonia species 
LSUrRNA gene 
(partial; isolate Tr S 
5; clone 16) 

Human mRNA for
golgi antigen gcp372,
complete cds 



Cow dopamine 
transporter mRNA, 
putative cds. 



245 | AB007918



246 | X51754



Homo sapiens mRNA
for KIAA0449
protein, partial cds
Human U266 
rearranged DNA for 
lambda- 

immunoglobulin light 
chain 



247 | AE001554



248 | Z64067



249 | AJ223768



Helicobacter pylori,
strain J99 section 115
of 132 of the
complete genome

H.sapiens CpG DNA.
clone 96e7, reverse
read cpg96e7.rt1a.



P VALUE | ACCESSION



DESCRIPTION 



0.64 



0.64 



1174506 



111230 



GLUTAMYL-TRNA

glutamate- 
tRNA ligase (EC 6.1.1.17)- 
Haemophilus influenzae (strain 
Rd KW20) >gi|1573240 
(U32713) glutamyl-tRNA 
synthetase (gltX) [Haemophilus 
influenzae Rd] 



ultra-high-sulfur keratin 1 
mouse 



P VALUE 



0.64 



3874972 



0.64 



0.63 



2833239 



2072301 



Pinus sylvestris
microsatellite DNA, 
clone SPAC 11.5 



0.62 



0.62 



<NONE> 



<NONE> 



0.62 



<NONE> 



(Z99709) similar to Elongation 
factor Tu family (contains 
ATP/GTP binding P-loop); 
cDNA EST EMBL:D76223 
comes from this gene; cDNA 
EST yk478c5.5 comes from this 
gene [Caenorhabditis elegans]



1.2



1e-05



EPIDERMAL GROWTH
FACTOR RECEPTOR 
KINASE SUBSTRATE EPS 8 
>gi|530823 (U12535) epidermal 
growth factor receptor kinase 
substrate [Homo sapiens] 



(U95102) mitotic 
phosphoprotein 90 [Xenopus 
laevis] 



8e-06 



<NONE> 



<NONE> 



<NONE> 



2e-14



1.5 



<NONE> 



<NONE> 



<NONE> 






250 


AJ011592


Bacteriophage P1 ban
gene




PHOTOSYSTEM II 10 KD
PHOSPHOPROTEIN deltoides
>gi|2143326|gnl|PID|e319090
(Y13328) 10kDa
phosphoprotein [Populus
deltoides]


7.9


251 


AF027151 


Xenopus laevis
survival of motor
neuron protein
interacting protein 1
(SIP1) mRNA,
complete cds


0.62


4007790


(AL034463) putative single- 
strand polynucleotide binding 
protein [Schizosaccharomyces 
pombe] 


2.0


252 


AJ000376 


Helobdella triserialis 
mRNA for actin 


0.62 


1117968


(U40763) CARS-Cyp [Homo
sapiens] sapiens]


0.90


253 


M69231 


Rat thymosin beta 4 
gene (pTB4G).intron. 


0.62 


4176370


ha^uldldb; similar to calcium- 
independent phospholipase A2;
similar to AC004392
(PID:g3367519) [Homo
sapiens]


6e-51


254 


AB021638 


Homo sapiens X11L2
mRNA for X11-like
protein 2, complete 
cds 


0.61 


<NONE> 


<NONE> 


<NONE> J 


255 


D26470 


Bacteroides 
gingivalis DNA for 
arginyl 

endopeptidase, 
complete cds 


0.61 


<NONE> 


<NONE> 


<NONE> I 


256 


J04737 


A.thaliana ATPase 
gene, complete cds. 


0.61 1 


<NONE> I 


<NONE> 


<NONE> I 


257 


U06756


Bos taurus clone
bml308

microsatellite and are-
[d repeat region


0.61


1922280


(Y09905) snail like protein
[Gallus gallus]


0.51


258 


S75756


p15=cyclin D-
dependent kinases 4
and 6-binding
protein/p15 product
{exon/intron 1}
[human, brain tumors,
Genomic, 753 nt]


0.61


484938


hypothetical protein 253 -
Streptomyces griseus plasmid
pSG1 (fragment)


0.13


259 


L39837


Drosophila
melanogaster tumor
suppressor (warts)
mRNA exons 1-8,
complete cds.


0.61


3875131


(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]


1e-09


260 


U52428


Human fatty acid
synthase gene, partial
cds


0.61


4226073


(AF125443) contains similarity
to S. pombe phosphatidyl
synthase (GB:Z28295)
[Caenorhabditis elegans]




261 


Plasmodium
falciparum gene for
heat-shock protein
X15292 | Pf203


0.60 


J <NONE> 


<NONE> 


2e 26 
<NONE> I 


262 


AB020663


Homo sapiens mRNA
for KIAA0856
protein, partial cds


0.60


470341


(U00043) No definition line
found [Caenorhabditis elegans]


5.7


263 


U68723


Human checkpoint
suppressor 1 mRNA,
complete cds


0.60


544375


GALACTOSE-BINDING
PROTEIN REGULATOR
glucose/galactose binding
protein regulator -
Agrobacterium tumefaciens
>gi|142228 (L10424)
glucose/galactose binding
protein regulator


5.7


264


S.griseus sporulation
protein genes 1590
M32687 | and 1422.


0.60


2582017


(AF012871) Merg1a* [Mus
musculus]


3.3


265


Homo sapiens
NKCC2 gene, exon 4,
AJ005331 | isoform B


0.60


3128353


(AF010496) maltose transport
inner membrane protein


1.5


266


Mus musculus RGL
protein mRNA,
U14103 | complete cds.


0.60


4099845


(U90533) serine protease
inhibitor [Streptomyces fradiae]


0.098


267


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


0.59


3282851


(AF047897) ankyrin-like protein
IGE-ANK [Ehrlichia sp. BDS]


5.5


268


AE000872


Methanobacterium
thermoautotrophicum
from bases 896604 to
912784 (section 78 of
148) of the complete
genome


0.59


401553


HYPOTHETICAL 24.5 KD
PROTEIN IN NADB-SRMB
INTERGENIC REGION


4.3
Nearest Neighbor (BlastN vs. Genbank)
SEQ 

DESCRIPTION 



ID | ACCESSION



P VALUE 



269 



L11871



Gallus gallus achaete 
scute homologue 
(ASH) mRNA, 
complete cds. 



0.59 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



628110 



tteTpgsrmn 4 leading f r ame r 

[Human herpesvirus 4] 2
[Human herpesvirus 4]
>gi|1334838|gnl|PID|e25079 4
[Human herpesvirus 4]
>gi|1334840|gnl|PID|e25081 6
[Human herpesvirus 4]
>gi|1334842|gnl|PID|e25067 8

[Human herpesvirus 4]

>gi|1334844|gnl|PID|e25069 10

[Human herpesvirus 4]

>gi|1334846|gnl|PID|e25071 12

[Human herpesvirus 4]



P VALUE 



270 | AF017114 



Oryctolagus 
cuniculus glycogen 
synthase mRNA, 
complete cds 



0.59 



728856 



271 | AF027807



272 | U81787



273 | U76036



Homo sapiens beta- 
casein (CSN2) gene, 
complete cds 



0.59 



3252932 



Human Wnt10B
mRNA, complete cds 



Apteryx australis 16S 
ribosomal RNA gene 
mitochondrial gene 
for mitochondrial 
RNA, partial 
sequence 



0.59 



3875538 



274 | AB014564



275 | AF044171



Homo sapiens mRNA 
for KIAA0664 
protein, partial cds 



Homo sapiens cyclin 
dependent kinase 
inhibitor 2D 
(CDKN2D) gene, 
partial cds 



NITROGENASE IRON-IRON 
PROTEIN ALPHA CHAIN 
(NITROGENASE 
COMPONENT I) 
(DINITROGENASE) capsulatus]
>gi|312238 (X70033)
alternative nitrogenase



(AF067155) truncated rev
protein [Human
immunodeficiency virus type 1]



(Z67990) similar to cuticle
collagen



0.59 



4193356 



0.59 



1709851 



0.59 



3925213 



(AF055088) ATP-binding 
cassette; PsaB [Streptococcus 
pneumoniae] 



2.4 



1.5 



PTB-ASSOCIATED SPLICING
FACTOR (PSF) long form -
human >gi|38458 (X70944)
PTB-associated splicing factor
[Homo sapiens]



0.83 



(AL032626) Y37D8A.17 
[Caenorhabditis elegans] 



0.17 



3e-10


276 


L19640


Saccharomyces
cerevisiae cdc2/cdc28
related protein kinase
gene, complete cds.


0.59


3880115


(Z81130) T23G11.9
[Caenorhabditis elegans]


1e-21


277 


Z80999


Human DNA
sequence from
cosmid E140G5 on
chromosome 22,
complete sequence
[Homo sapiens]


0.58 


<NONE> 


<NONE> 


<NONE> J 


278 


Y11108 


H.sapiens WNT8B 
gene 


0.58 


<NONE> 


<NONE> 


<NONE> J 


279 


U80001 


Sphyraena idiastes 
lactate dehydrogenase 
A 


0.58 


<NONE> 




<NONE> J 


280 


Z49637 


S.cerevisiae 
chromosome X 
reading frame ORF 
YJR137c 


0.58 


<NONE> 


<NONE> 


1 <NONE> J 


281 


X64467 


H.sapiens ALAD 
gene for 

porphobilinogen 
synthase 


0.58 


<NONE> 


<NONE> J 


<NONE> I 


282 


X74506 


G.gallus hox B3 
mRNA 


0.58 


<NONE> 


<NONE> 1 


<NONE>| 


283 I 


U68040 


Cochliobolus 
heterostrophus 
polyketide synthase 


0.58 


<NONE> 


<NONE> I 


<NONE> 


284 I 


AF089084


Arabidopsis thaliana
putative auxin efflux
carrier protein (PIN1)
mRNA, complete cds


0.58


<NONE>


<NONE>


<NONE>


285


U38481


Rattus norvegicus
ROK-alpha mRNA,
complete cds


0.58


<NONE>


<NONE>


<NONE>


286


AF017656


Homo sapiens G
protein beta 5 subunit
mRNA, complete cds


0.58


3236249


(AC004684) hypothetical
protein [Arabidopsis thaliana]


9.2


287


M96234


Human glutathione
transferase class mu
number 4


0.58


1280073


(U55366) Similar to cuticle
collagen [Caenorhabditis
elegans]


7.1


288


AB002339


Human mRNA for
KIAA0341 gene,
partial cds


0.58


861293


(U28741) F35D2.1 gene
product [Caenorhabditis
elegans]


7.1


289 


U11295


Neisseria 
gonorrhoeae 
carbamoyl phosphate 
synthetase 
(glutamine) small 
subunit (carA) and 
large subunit (carB) 
genes, complete cds. 


0.58 


2425135


(AF020283) DG2044 gene
product [Dictyostelium
discoideum]


5.3


290


D80001


Human mRNA for 
KIAA0179 gene, 
partial cds 


0.58 


4097223


(U49836) gamma-glutamyl
transpeptidase precursor [Brugia
malayi]


4.1


291 


Z11700 


Escherichia coli 
genes faeG, faeH, 
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAE
HIJ E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins


0.58


2347188


(AC002338) laccase isolog
[Arabidopsis thaliana] thaliana]


3.2


292 


M77350 


Mouse hair keratin 
A1 (MHKA1) gene,
complete cds. 


0.58 


141165 


HYPOTHETICAL 8.3 KD 
PROTEIN >gi|62179


3.2


293


X63787 


T.thermophila gene 
for snRNA U3-2 


0.58 


2826900 


(AB004461) DNA polymerase 
alpha catalytic subunit [Oryza 
sativa] 


3.1


294


D63881


Human mRNA for
KIAA0160 gene,
partial cds


0.58


1934730


(U95036) germin-like protein
[Arabidopsis thaliana]


3.1


295


U39378


Gymnocarena
mexicana 16S
ribosomal RNA gene,
mitochondrial gene
encoding

mitochondrial RNA,
partial sequence


0.58


2194131


(AC002062) Similar to
Synechocystis antiviral protein


3.1


296


X87987


P.pastoris PRC1 gene
>
dbj|E12103|E12103
DNA encoding
precursor of protease
from Pichia pastoris


0.58


3914197


OCCLUDIN >gi|1276983
(U49221) occludin [Canis
familiaris]
>gi|1589181|prf||2210347O
occludin [Canis familiaris]


3.1


297 


i X^5782 


A.thaliana (L.Heynh.)
chloroplast mRNA
for recombinant APS
kinase


0.58


1732444


(D38529) DRPLA protein 
[Homo sapiens] 


2.4 


298 


M64848 


" Mouse platelet- 
derived growth factor 

rs r*P5JiTl miicniltic 
i_> t Jiaill III LloCUIUo 

platelet-derived 
growth factor beta- 
chain (sis) gene, exon 
5. 


0.58 


3025832


(AF055985) pyrrolidone-rich
antigen [Onchocerca volvulus]


1.4


299




Helicobacter pylori,
strain J99 section 21
of 132 of the
complete genome


0.58


2827198


(AF037454) ubiquitin protein
ligase [Mus musculus]


1.1


300 


X65720 


M.musculus gene for 
protein kinase C- 
gamma (exonl and 
exon 2) 


0.58


418395


CHD1 PROTEIN

>gi|320737|pir||S30818 
hypothetical protein YER164w - 
yeast (Saccharomyces 
cerevisiae) >gi|603404 
(U18917) Chdlp: transcriptional 
regulator [Saccharomyces 
cerevisiae] 


1.1 


301 


r\r\JHJ> l JU 


Arabidopsis thaliana 
lactate dehydrogenase 


0.58


3024637


SEX-DETERMINING
REGION Y PROTEIN
determining protein [Mus


0.62


302 


D28116 


Human genes for 
collagen type IV 
alpha 5 and 6, exon 1 
and exon 1' 


0.58


1458250


(U64835) T09D3.3
[Caenorhabditis elegans]


0.36


303


AE001075


Archaeoglobus
fulgidus section 32 of
172 of the complete
genome


0.58


2276333


(Z97991) hypothetical protein
Rv0336


0.36


304 


AF003948


Rhodococcus opacus
chloromuconate
cycloisomerase
transposase homolog
genes, complete cds


0.58


477072


mucin 7 precursor, salivary -
human


0.28


305 


U10692


Human MAGE-7
antigen (MAGE7)
pseudogene, complete
cds.


0.58


3287858


HOMEOBOX PROTEIN HOX-
C11


0.054


306 


AF003948 


Rhodococcus opacus 
chloromuconate 
cycloisomerase 
transposase homolog 
genes, complete cds 


0.58 


3551821 


(AF058803) mucin 4 [Homo 
sapiens] 


0.041 1 


307 


X99350 


H.sapiens HhH4 
gene, exon 1 and 
joined CDS 


0.58 


137483


VAV PROTO-ONCOGENE
>gi|55221 (X64361) proto-
oncogene [Mus musculus]


0.024


308 


AJ234282 


Homo sapiens mRNA 
for Ig heavy chain 
variable region, clone 
C 


0.58 


3264846


(AC003682) R27945_2 [Homo
sapiens]


0.018


309 


AF079310 


Mus musculus histone 
deacetylase 3 
(Hdac3) gene, exons 
4 through 15 and 
complete cds 


0.58


1657601 


(U66220) unknown 
[Nannocystis exedens] 


0.014 


310 


AF019367 


Human thiopurine 
methyl transferase 
(TPMT) gene, exons 
6 and 7 


0.58 


3283352 


(AF063020) lens epithelium- 
derived growth factor [Homo 
sapiens] 


0.011 


311 


X65720 


M. musculus gene for 
protein kinase C- 
gamma (exon1 and 
exon 2) 


0.58 


1790878 


(U38291) microtubule- 
associated protein 1a [Homo 
sapiens] 


0.008 


312 


AB011155 


Homo sapiens mRNA 
for KIAA0583 
protein, partial cds 


0.58 


1351166 


SYNAPSINS IA AND IB 
>gi|163713 


0.006 


313 


X63692 


H.sapiens mRNA for 
DNA 


0.58 


1817548 


(D84307) phosphoethanolamine 
cytidylyltransferase [Homo 
sapiens] 


0.001 


314 


U53746 


immunodeficiency 
virus isolate FIV- 
Pco336-8 pol 
polyprotein (pol) 
gene, partial cds 


0.58 


2246532 


(U93872) ORF 73, contains 
large complex repeat CR 73 


2e-05 


315 


K00436 


Rattus norvegicus 
(clone rtl-1) pseudo- 
Gly-tRNA gene. 


0.58 


206712 


(M64793) salivary proline-rich 
protein [Rattus norvegicus] 


1e-05 
PCT/US00/18374 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


316 


S79632 


HSF2=heat shock 
factor 2 {alternatively 
spliced, splice 
junction region} 
[mice, CBA/J, testis, 
Genomic, 120 nt, 
segment 2 of 3] 


0.58 


4038594 


(AJ222798) tDET1 protein 
[Lycopersicon esculentum] 


3e-06 


317 


D43964 


Rat liver mRNA for 
Kan-1, complete cds 


0.58 


1280135 


(LCoJ V6J coded for by C. 
elegans cDNA cm21e6; coded 
for by C. elegans cDNA 
cm01e2; similar to melibiose 
carrier protein 
(thiomethylgalactoside permease 
II) 


1e-08 


318 


AB007918 


Homo sapiens mRNA 
for KIAA0449 
protein, partial cds 


0.58 


2833239 


EPIDERMAL GROWTH 
FACTOR RECEPTOR 
KINASE SUBSTRATE EPS 8 
>gi|530823 (U12535) epidermal 
growth factor receptor kinase 
substrate [Homo sapiens] 


3e-13 


319 


AB001466 


Homo sapiens mRNA 
for Efs1, complete 
cds 




2943716 


(D45027) 25 kDa trypsin 
inhibitor [Homo sapiens] 


2e-14 


320 


ZU701 


Saccharomyces 
cerevisiae ERE1 gene 
for putative protein 
kinase. 


0.58 


3880115 


(Z81130) T23G11.9 
[Caenorhabditis elegans] 


9e-21 


321 


Z49535 


chromosome X 
reading frame ORF 
YJR035w 


U.jo 


4106562 


(Z83819) dJ146H2i.2 (similar 
to CYTOCHROME B-245 
HEAVY CHAIN) [Homo 
sapiens] 


3e-33 


322 


M62506 


S.cerevisiae DBF20 
gene, complete cds. 


0.57 


<NONE> 


<NONE> 


<NONE> 


323 


X05944 


Yeast PSS gene for 
phosphatidylserine 
synthetase 


0.57 


<NONE> 


<NONE> 


<NONE> 


324 


D38536 


Snail gene for ADP- 
ribosyl cyclase, 
complete cds 


0.57 


<NONE> 


<NONE> 


325 


Z75004 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR096w 


0.57 


<NONE> 


<NONE> 


<NONE> 


326 


L77034 


Homo sapiens 
(subclone 10_e10 
from P1 H16) DNA 
sequence. 


0.57 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID I ACCESSION I DESCRIPTION 



327 | D37887 



328 | AB014562 



329 



Z69651 



330 



D89285 



Cyprinus carpio c- 



myc gene for c-Myc, 
complete cds 



Homo sapiens mRNA 
for KIAA0662 
protein, partial cds 



Human DNA 
sequence from 
cosmid L75B9, 
Huntington's Disease 
Region, chromosome 
4p16.3 



331 I Z48951 



332 



X95573 



333 



U95094 



Mesocricetus auratus 
mRNA for inter-alpha 
trypsin inhibitor 
heavy chain 1, 
complete cds 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE 



ACCESSION 



DESCRIPTION 



0.57 



<NONE> 



<NONE> 



P VALUE 



0.57 



(M57576) Ig kappa chain [Mus 
197406 musculus] 



0.57 



1079280 



chaperonin containing TCP-1 
complex gamma chain - African 
clawed frog >gi|793886 
(X84990) Ccta 



S.cerevisiae 
chromosome XVI 
cosmid 9723 



A.thaliana mRNA for 
salt-tolerance zinc 
finger protein 



0.57 



RYANODINE RECEPTOR, 
134132 SKELETAL MUSCLE 



<NONE> 



(AJ130783) APC2 protein [Mus 
0.57 I 4210432 musculus] 



0.57 



1174828 



TYROSINE 
DECARBOXYLASE 2 
4.1.1.25) - parsley >gi|169671 
(M96070) tyrosine 
decarboxylase [Petroselinum 



Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 



334 | AE001116 



335 



Z34291 



Borrelia burgdorferi 
(section 2 of 70) of 
the complete genome 



R.norvegicus mRNA 
for putative chloride 
channel. 



0.57 



465646 



PROBABLE ABC 
TRANSPORTER ATP- 
BINDING PROTEIN IN 
NTRA/RPON 5'REGION 
(ORF1) Azorhizobium 
caulinodans >gi|311388 
(X69959) ORF1 



0.57 



2314735 



(AE000653) Na+/H+ antiporter 
(nhaA) [Helicobacter pylori 
26695] 



0.57 



1350832 



DNA-DIRECTED RNA 
POLYMERASE I SECOND 
LARGEST SUBUNIT (RNA 
POLYMERASE I SUBUNIT 2) 
chain RPA2 - Euplotes 
octocarinatus (SGC9) 
>gi|578407 octocarinatus] 



8.9 



8.9 



6.9 



5.3 



5.2 



4.0 



4.0 



3.0 
Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


336 


D88255 


Homo sapiens A30 
Vk germline gene, 
partial cds 


0.57 


3875983 


(Z81063) similar to Actinin-type 
actin-binding domain containing 
proteins [Caenorhabditis 
elegans] 


3.0 


337 


AF037261 


Homo sapiens SH3- 
containing adaptor 
molecule-1 mRNA, 
complete cds 


0.57 


1397341 


(Ufii^J similar to Kinesin-like 
protein; coded for by C. elegans 
cDNA yk184h5.3; coded for by 
C. elegans cDNA yk184h5.5; 
coded for by C. elegans cDNA 
yk13d7.3; coded for by C. 
elegans cDNA yk13d7.5; coded 
for by C. elegans cDNA 
yk31e1.5;co...>gi|3493541 
^/vruj / jo / ; Kinesin-like protein 
ZEN-4a [Caenorhabditis 
elegans] 


2.3 


338 


U26595 


Rattus norvegicus 
prostaglandin F2a 
receptor regulatory 
protein precursor, 
mRNA, complete cds 


0.57 


2773160 


(AF039656) neuronal tissue- 
enriched acidic protein [Homo 
sapiens] 


2.3 


339 


X69903 


R. norvegicus mRNA 
for interleukin 4 
receptor 


0.57 


2649193 


(AE001009) quinone-reactive 
Ni/Fe-hydrogenase B-type 
cytochrome subunit (hydC) 
[Archaeoglobus fulgidus] 


IS I 


340 


Z74825 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOL083w 


0.57 


14*5 *n 10 


(U64S46) F47D2.5 gene 
product [Caenorhabditis 


1.4 


341 


AJ131469 


Foot-and-mouth 
disease virus O vp1 
gene, strain O/A/58 


0.57 


91206 


proline-rich protein - mouse 
(fragment) musculus] 


1.4 


342 


AF011360 


Mus musculus 
regulator of G-protein 
signaling 7 (RGS7) 
mRNA, complete cds 


0.57 


542514 


gelsolin - American lobster 


0.80 


343 


AF011360 


Mus musculus 
regulator of G-protein 
signaling 7 (RGS7) 
mRNA, complete cds 


0.57 


1078946 


gelsolin - American lobster 
>gi|452313 gelsolin [Homarus 
americanus] 


0.80 






WO 01/02568 



PCT/US00/18374 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


344 


L39210 


Homo sapiens inosine 
monophosphate 
dehydrogenase type II 
gene, complete cds 


0.57 


559526 


(X77466) 98.8kD polyprotein 
[Strawberry latent ringspot 
virus] 


0.79 


345 


U81523 


Human endometrial 
bleeding associated 
factor mRNA, 
complete cds 


0.57 


211499 


(K01702) HMW/LMW collagen 
subunit precursor [Gallus gallus] 


0.79 


346 


U46561 


Tetrahymena 
thermophila 
polyubiquitin (TTU3) 
gene, complete cds, 
and RNA polymerase 
II subunit 2 (RPB2) 
gene, partial cds 


0.57 


2506493 


HYPOTHETICAL 100.5 KD 
PROTEIN IN IAP-CYSH 
INTERGENIC REGION 
>gi|8S2654 (U29579) alternate 
gene name ygcB; ORF_fi888 
[Escherichia coli] >gi|1789119 


0.60 


347 


X95543 


C.japonica mRNA for 

legumin (clone 
CjLeg31) 


0.57 


1709261 


NEUROFILAMENT TRIPLET 
M PROTEIN (160 KD 
NEUROFILAMENT 
PROTEIN) (NF-M) 
>gi|1083164|pir||S55395 
neurofilament protein M - rabbit 
(fragment) >gi|854353 


0.46 


348 


Y17282 


Homo sapiens mRNA 
for cytokeratin type II 


0.57 


3044086 


(AF055904) unknown 
[Myxococcus xanthus] 


0.45 


349 


X00716 


Frog mRNA fragment 
for alpha-A2- 
crystallin 


0.57 


3406654 


(AF079369) transcriptional 
repressor TUP1 [Dictyostelium 
discoideum] 


0.20 


350 


X53238 


Klebsiella sp. 
bacteriophage K11 
gene 1 for RNA 
polymerase 


0.57 


1228093 


(Z46913) polyketide synthase 


0.16 


351 


X99012 


H.sapiens FUS gene, 
exon 12 


0.57 


243898 


(S78897) GOR=antigenic 
epitope [chimpanzees, Peptide, 
•27 aa] [Panl 


n non 


352 


AL008711 


Human DNA 
sequence from PAC 
390N22 on 
chromosome Xp22.2 


0.57 


1469545 


(U53585) fibronectin attachment 
protein [Mycobacterium avium] 


0.053 


353 


S74506 


SOX9 [human, fetal 
brain, Genomic, 1494 
nt, segment 3 of 3] 


0.57 


1326350 


J5S748) similar to potential 
transmembrane domains in S. 
cerevisiae nuclear division 
RFT1 protein (SP:P38206) 


0.017 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


354 


D25542 


Human mRNA for 
golgi antigen gcp372, 
complete cds 


0.57 


4063399 


(AF102575) cell surface protein 
DTFA [Dictyostelium 
discoideum] 


0.005 


355 


AB015426 


Mus musculus mRNA 
for alpha 1,3- 
fucosyltransferase IX, 
complete cds 


0.57 


2661842 


(Y15732) DNA polymerase beta 
[Xenopus laevis] 


7e-11 


356 


X51394 


Xenopus mRNA for 
APEG protein, 
containing a highly 
repetitive amino acid 
sequence 


0.57 


1929056 


(Y12090) putative 3,4- 
dihydroxy-2-butanone kinase 
[Lycopersicon esculentum] 


9e-12 


357 


AB007918 


Homo sapiens mRNA 
for KIAA0449 
protein, partial cds 


0.57 


2833239 


EPIDERMAL GROWTH 
FACTOR RECEPTOR 
KINASE SUBSTRATE EPS 8 
>gi|530823 (U12535) epidermal 
growth factor receptor kinase 
substrate [Homo sapiens] 


3e-13 




358 


AB001466 


Homo sapiens mRNA 
for Efs1, complete 
cds 


0.57 


2943716 


(D45027) 25 kDa trypsin 
inhibitor [Homo sapiens] 


2e-14 


359 


Y00760 


Rabbit mRNA for 
adult fast skeletal 
troponin-C 


0.57 


2576348 


(AC002400) Glutamyl tRNA 
synthetase [Homo sapiens] 


2e-28 


360 


X95153 


H.sapiens brca2 gene 
exon 3 > :: 

emb|A62778|A62778 
Sequence 19 from 
Patent WO9719110 


0.57 


3419847 


(AC004982) similar to yeast 
hypothetical protein ybk4; 
similar to P38164 
(PID:g586461) [Homo sapiens] 


2e-55 


361 


X85967 


B. vulgaris mRNA for 
betavulgin 


0.56 


<NONE> 


<NONE> 


<NONE> 


362 


U09251 


Mycoplasma 
genitalium DNA 
gyrase subunit B 
complete cds, DNA 
polymerase III beta 
subunit (dnaN) and 
seryl-tRNA 
synthetase (serS) 
genes, partial cds. 


0.56 


<NONE> 


<NONE> 


<NONE> 


363 


V00158 


Chloroplast Euglena 
gracilis genes coding 
for transfer RNAs 
specific for threonine, 
glycine, methionine, 
serine and glutamine. 


0.56 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






364 


D88151 


Clostridium 
perfringens DNA for 
D-alanine:D-alanine 
ligase, cortical 
fragment-lytic 
enzyme 


0.56 


<NONE> 


<NONE> 


<NONE> 


365 


U67478 


Methanococcus 
jannaschii section 20 
of 150 of the 
complete genome 


0.56 


<NONE> 


<NONE> 


<NONE> 


366 


L23800 


Tachyglossus 
aculeatus beta-globin 
homolog (HBB) 
gene, complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


367 


AB011129 


Homo sapiens mRNA 
for KIAA0557 
protein, partial cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


368 


L77034 


Homo sapiens 
(subclone 10_e10 
from P1 H16) DNA 
sequence. 


0.56 


<NONE> 


<NONE> 


<NONE> 


369 


Z47202 


C. albicans gene for 
TFIIIB (BRF1) 
subunit. 


0.56 


<NONE> 


<NONE> 


<NONE> 


370 


U53868 


Clostridium 
acetobutylicum 
mannitol-specific 
phosphotransferase 
system (PTS) system, 
mtlA, mtlR, mtlF, and 
mtlD genes, complete 
cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


371 


AF041259 


Homo sapiens breast 
cancer putative 
transcription factor 
(ZABC1) mRNA, 
complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


372 


L42636 


Plasmodium 
falciparum variant- 
specific surface 
protein (var-7) 
mRNA, complete cds. 


0.56 


2213557 


(Z97052) hypothetical protein 


8.8 





Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












382 


X97986 


M.musculus mRNA 
for desmocollin type 
1 


0.56 


2497227 


HYPOTHETICAL 113.1 KD 
PROTEIN IN PRE5-FET4 
INTERGENIC REGION 
>gi|1072409 (Z54141) unknown 


1.7 


383 


AF087455 


Didelphis virginiana 
G protein receptor 
kinase 2 mRNA, 
complete cds 


0.56 


1213453 


(U12964) contains ankyrin-like 
repeats; similar to human 
desmoplakin repeat region 
[Caenorhabditis elegans] 


1.3 


384 


D80011 


Human mRNA for 
KIAA0189 gene, 
complete cds 


0.56 


226535 


protease [Hepatitis B virus] 


1.1 


385 


AJ002272 


Mus musculus mRNA 
for HAP1-A protein, 
3' region 


0.56 


3327158 


(AB014572) KIAA0672 protein 
[Homo sapiens] 


1.0 


386 


L39210 


Homo sapiens inosine 
monophosphate 
dehydrogenase type II 
gene, complete cds 


0.56 


628431 


coat protein - strawberry latent 
ringspot virus 


0.77 


387 


X02770 


Mouse Thy-1.2 gene 
5' untranslated region 
and exon 1 


0.56 


3327046 


(AB014516) KIAA0616 protein 
[Homo sapiens] 


0.59 






Schizosaccharomyces 
pombe Wiskott- 
Aldrich Syndrome 
protein homolog 
(wsp1+) gene, 
complete cds, and 
BTF3/beta-NAC 
gene, partial sequence 


0.56 


88466 


salivary proline-rich 
phosphoprotein precursor PRH1 
(allele PIF) - human >gi|190484 
(K03203) prepro salivary 
proline-rich protein [Homo 
sapiens] >gi|190512 


0.35 


389 


X56747 


Rat mRNA for fetal 
intestinal lactase- 
phlorizin hydrolase 
precursor, partial 


0.56 


2072742 


(Z48674) chitinase homologue 
[Sesbania rostrata] 


0.23 


390 


Y12072 


G.arboreum mRNA 
for farnesyl 
pyrophosphate 
synthase 


0.56 


296670 


(X07882) Po protein [Homo 
sapiens] 


0.20 


391 


S75756 


p15=cyclin D- 
dependent kinases 4 
and 6-binding 
protein/p15 product 
{exon/intron 1} 
[human, brain tumors, 
Genomic, 753 nt] 


0.56 


1082743 


protein kinase (EC 2.7.1.37) 
SPRK - human sapiens] 
>gi|1090771|prf||2019437A 
protein Tyr kinase I 


0.15 
Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION 



392 | U62528 



393 | X96877 



394 | S78788 



395 | AF006640 



DESCRIPTION 



Equus caballus type 



II collagen mRNA, 
complete cds 



C.reinhardtii mRNA 
for unknown lumenal 
polypeptide 



cGATA-3 [chickens, 
liver, Genomic, 979 
nt, segment 4 of 4] 



Drosophila 
melanogaster Ste20- 
like protein kinase 
mRNA, complete cds 



396 | AF006640 



Drosophila 
melanogaster Ste20- 
like protein kinase 
mRNA, complete cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



0.56 



461671 



0.56 



3341678 



0.56 



2661590 



0.56 



1109830 



P VALUE 



[Segment 1 of 2] COLLAGEN 
ALPHA 1(II) CHAIN 



(AC003672) putative zinc finger 
protein [Arabidopsis thaliana] 



(AL009196) 1- 

evidence=predicted by content; 
l-method=genefinder;084; 1- 
method_score=59.41; 1- 
evidence_end; 2- 
evidence=predicted by match; 2- 
match_accession=AA950019; 2 
match_description=LD29959.5p 
rime LD Drosophila 
melanosas... 



(U41534) coded for by C. 
elegans cDNA CEESI42F; 
Similar to helicases of 
SNF2/RAD54 family. 
[Caenorhabditis elegans] 



0.030 



5e-09 



2e-11 



397 | AE000716 



Aquifex aeolicus 
section 48 of 109 of 
the complete genome 



0.56 



1109830 



0.56 



3688350 



(U41534) coded for by C. 
elegans cDNA CEESI42F; 
Similar to helicases of 
SNF2/RAD54 family. 
[Caenorhabditis elegans] 



IX ^- 1 ! W 1 UUL'UIUJ ^ ill 1,3 J 

(AL030996; dJllS$624.4 

(novel PUTATIVE protein 
similar to hypothetical proteins 
S. pombe C22F3.14C and C. 
elegans C16A3.8) [Homo 
sapiens] 



6e-12 



4e-13 



3e-66 



398 | Z36079 



S.cerevisiae 
chromosome II 
reading frame ORF 
YBR210w 



0.55 



<NONE> 



<NONE> 



<NONE> 



399 | Y17267 



Mus musculus mRNA 
for ubiquitin 
conjugating enzyme 



0.55 



400 | AC001461 



Homo sapiens 
(subclone 2_g5 from 
BAC H107) DNA 
sequence 



<NONE> 



0.55 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 
Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION 



DESCRIPTION | P VALUE 
Alouatta seniculus 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



401 | AF019079 



breast and ovarian 
susceptibility 
(BRCA1) gene, 
partial cds 



0.55 



402 



M90058 



Human serglycin 
gene, exons 1,2, and 
3. 



<NONE> 



<NONE> 



<NONE> 



0.55 



<NONE> 



<NONE> 



403 | AB013469 



Mus musculus CLM2 1 
gene for cytohesin 2, 
complete and partial 
cds, alternative 
splicing 



0.55 



(Z68152) chitinase [Gossypium 
1729760 hirsutum] 



404 | AJ011592 



405 | Z15118 



406 | Z48951 



407 | U78726 



408 | AG001389 



409 | M27640 



Bacteriophage P1 ban 
gene 



0.55 



T.brucei kinetoplast 
maxicircle variable 
region DNA 



2493689 



PHOTOSYSTEM II 10 KD 
PHOSPHOPROTEIN deltoides] 
>gi|2143326|gnl|PID|e319090 
(Y13328) 10kDa 
phosphoprotein [Populus 
deltoides] 



0.55 



S.cerevisiae 
chromosome XVI 
cosmid 9723 



2970432 



(AF049132) NADH 
dehydrogenase subunit 5 
[Florometra serratissima] 



0.55 



Homo sapiens mad 
protein homolog 
Smad2 gene, 
promoter, exon 1a 
and exon 1b 



(AJ130783) APC2 protein [Mus 
4210432 musculus] 



0.55 



Homo sapiens 
genomic DNA, 21q 
region, clone: 
9H11Bm42 



0.55 



Plasmodium vivax 
major blood stage 

surface antigen gene, 
partial cds. 



0.55 



3319290 



(AF055994) thyroid hormone 
receptor-associated protein 
complex component TRAP220 
[Homo sapiens] 



125684 



KRUEPPEL PROTEIN 
>gi|72899|pir||TWFF Krueppel 
gap protein - fruit fly 
(Drosophila sp.) melanogaster] 
>gi|224875|prf||1202348A 
Krueppel gene 



549453 



X-LINKED PEST- 
CONTAINING 
TRANSPORTER transporter - 
human >gi|458255 (U05321) X- 
linked PEST-containing 
transporter [Homo sapiens] 



6.6 



6.5 



4.9 



4.9 



3.8 



3.8 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


410 


D37977 


Fugu rubripes mRNA 
for sodium channel 
alpha subunit, partial 
cds 


0.55 


143SCHS 


(D38024) ORF [Homo sapiens] 


3.7 


411 


M88505 


Ostertagia ostertagi 
cathepsin B-like 
cysteine protease 
gene, partial cds. 


0.55 


3941277 


(AF000900) p45 [Rattus 
norvegicus] 


2.9 


412 


U95098 


Xenopus laevis 
mitotic 
phosphoprotein 44 
mRNA, partial cds 


0.55 


2570154 


(AB008376) 17-kDa PKC- 
potentiated inhibitory protein of 
PP1 [Sus scrofa] 


2.8 


413 


U89241 


Human mibp gene, 
partial cds 


0.55 


4097465 


(U62253) 16kDa secretory 
protein [Sus scrofa] 


2.2 


414 


AF027151 


Xenopus laevis 
survival of motor 
neuron protein 
interacting protein 1 
(SIP1) mRNA, 
complete cds 


0.55 


4007790 


(AL034463) putative single- 
strand polynucleotide binding 
protein [Schizosaccharomyces 
pombe] 


1.7 


415 


AF006821 


Bufo marinus 
natriuretic peptide 
receptor C mRNA, 
partial cds 


0.55 


2245075 


(Z97343) GTP-binding RAB2A 
protein 




416 


Y12736 


Lactococcus lactis 
cremoris plasmid 
pJW565 DNA. 
HabiiM, IlabiiR genes 
and orfX 


0.55 


3386334 


(AF035120) type I procollagen 
pro-alpha ^ chain [Canis 
familiaris] 


1.7 
1.3 


417 


U38307 


Mus musculus 
collagen alpha-1 type 
1 gene, 5' flanking 
region, partial 
sequence. 


0.55 


1362802 


gastric mucin - human 
(fragment) >gi|547517 


1.3 


418 


D13473 


Mouse mRNA for 
Rad51 protein 


0.55 


1374698 


(D83032) nuclear protein, 
NP220 [Homo sapiens] 


1.3 


419 


AF045238 


Bungarus fasciatus 
acetylcholinesterase 
gene, alternatively 
spliced products, 
partial cds 


0.55 


3261734 


(Z94752) hypothetical protein 
Rv1004c 


0.99 


420 


AE000795 


Methanobacterium 
thermoautotrophicum 
from bases 1 to 
13208 (section 1 of 
148) of the complete 
genome 


0.55 


186396 


49413 1) mucin [Homo 
sapiens] 


0.97 
SEQ 
ID 



423 



428 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



421 | X99537 



422 | U08147 



P VALUE 



Y.lipolytica SEC62 



gene 



Aquilegia sp. 
phytochrome 
(PHYB/D) gene, 
partial cds. 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



0.55 



H.sapiens CpG DNA 
clone 12c8, reverse 
Z56586 read cpg12c8.rt1d. 



424 | U39442 



Mus musculus 
glutamine:fructose-6- 
phosphate 
amidotransferase 
(GFAT) gene, 5' 
region and partial cds 



0.55 



0.55 



425 | K02298 



Rat chymotrypsin B 
(chyB) gene, 
complete cds. 



426 | X84792 



M.musculus clusterin 
gene 



427 | U00185 



Capra aegagrus 
Saanen and Weisse 
Edel breeds DR beta- 
chain antigen binding 
domain, MHC class II 
DRB 



H.sapiens CpG DNA, 
clone 178a12, reverse 
Z54946 read cpg178a12.rt1a. 



429 



430 



Oryctolagus 
cuniculus anion 
exchanger 3 brain 
isoform (AE3) 
AF031650 mRNA, complete cds 

Bovine adenylyl 
cyclase Type I 
M25579 mRNA, complete cds. 



0.55 



0.55 



0.55 



0.55 



0.55 



431 



Z48796 



H.sapiens Ski-W 
mRNA for helicase 



0.55 



0.55 



0.55 



3876397 



2338024 



DESCRIPTION 



(ZS1068)F25H5T 



[Caenorhabditis elegans] 



(AF005370) ribonucleotide- 
reductase, large subunit 



3320122 



282600 



3413810 



1652475 



2507136 



807646 



1778210 



2649040 



330452 



(U46007) espin [Rattus 
norvegicus] 



hypothetical protein - 
Mycoplasma hyorhinis 



(Y17034) Bassoon [Mus 
musculus] 



(D90905) hypothetical protein 



SUBTILIN BIOSYNTHESIS 
PROTEIN SPAB 



(M17294) unknown protein 
[Human herpesvirus 4] 



(U68412) fibrillar collagen 
[Arenicola marina] 

(AE000997) conserved 
hypothetical protein 
[Archaeoglobus fulgidus] 



P VALUE 



0.58 



(M14708) DNA polymerase 
[Human cytomegalovirus] 



0.57 



0.44 



0.43 



0.33 



0.25 



0.19 



0.065 



0.044 



0.023 



0.023 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION | DESCRIPTION 



432 | M80234 



Cow dopamine 
transporter mRNA, 
putative cds. 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



0.55 



3874972 



DESCRIPTION 



P VALUE 



(Z99709) similar to Elongation 
factor Tu family (contains 
ATP/GTP binding P-loop); 
cDNA EST EMBL:D76223 
comes from this gene; cDNA 
EST yk478c5.5 comes from this 
gene [Caenorhabditis elegans] 



433 | U91616 



434 | D10910 



Human I kappa B 
epsilon (IkBe) 
mRNA, complete cds 
Arabidopsis thaliana 
Atpk7 gene for 
serine/threonine 
protein kinase, 
complete cds 



435 



L22013 



Swinepox virus 
complete ORFs 
C20L-C1L > :: 
gb|I58297|I58297 
Sequence 14 from 
patent US 5651972 



0.55 



0.55 



436 



Z92653 



Human 

immunodeficiency 
virus type 1 env gene 



0.54 



0.54 



3875577 



3876072 



<NONE> 



(Z68314) similar to G-protein; 
cDNA EST EMBL:C11959 
comes from this gene; cDNA 
EST EMBL:C10341 comes 
from this gene; cDNA EST 
yk494e4.3 comes from this 
gene; cDNA EST yk448a8.5 
comes from this gene comes 
from this gene; cDNA EST 
EMBL:C10341 comes from this 
gene; cDNA EST yk494e4.3 
comes from this gene; cDNA 
EST yk448a8.5 comes from this 
gene [Caenorhabditis elegans] 
>gi|3880364|gnl|PID|e1349948 
(Z83016) similar to G-protein; 
cDNA EST EMBL:C11959 
comes from this gene; cDNA 
EST EMBL:C10341 comes 
from this gene; cDNA EST 
yk494e4.3 comes from this 
gene; cDNA EST yk448a8.5 
comes from this gene 
[Caenorhabditis elegans] 

(Z81505) Similarity to 
Metanococcus hypothetical 
protein 0682 (TR:Q58095) 
[Caenorhabditis elegans] 



4e-04 



<NONE> 



<NONE> 



<NONE> 



7e-06 



4e-42 



<NONE> 



<NONE> 






SEQ 

ID | ACCESSION | DESCRIPTION 



Nearest Neighbor (BlastN vs. Genbank) 



P VALUE 



437 | K01992 



438 | AE001415 



E.coli phosphate- 
repressive 
periplasmic 
phosphate-binding 
protein (phoS), 
peripheral membrane 
proteins (pstC, pstB 
and phoU) and 
integral membrane 
protein (pstA) genes, 
complete cds. 
Plasmodium 
falciparum 
chromosome 2, 
section 52 of 73 of 
the complete 
sequence 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



0.54 



439 | AF064030 



Helianthus tuberosus 
lectin 2 mRNA, 
complete cds 



440 | X12591 



E.coli plasmid DNA 
for colicin E9 



0.54 



<NONE> 



0.54 



441 



Caenorhabditis 
elegans YNK1-a 
U73679 | mRNA, complete cds 



442 



Unidentified 
bacterium DNA for 
Z93990 | 16S ribosomal RNA 



B. vulgaris mRNA for 
443 | X85967 betavulgin 



444 | U76524 



445 | X71800 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



H. sapiens gene for 5S 
rRNA (640 bp) > :: 
emb|X71801|HS5SR6 
40B H. sapiens gene 
for 5S rRNA (640 bp) 



0.54 



0.54 



0.54 



0.54 



0.54 



Human mibp gene, 
U89241 partial cds 



0.54 



0.54 



<NONE> 



P VALUE 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



757836 



151377 



3322653 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



(Z37980) ORF12 [Escherichia 
coli] | 8.3 



(M80653) tetraheme 
[Pseudomonas stutzeri] 



6.2 



(AE001216) T. pallidum 
predicted coding region TP0369 



(U62253) 16kDa secretory 
4097465 protein [Sus scrofa] 



2.7 



2.2 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


447 


L16013 


Rattus norvegicus Q- 
like gene sequence 


0.54 


3087760 


(AJ005583) p75 protein 
[Crypthecodinium cohnii] 

(Y10438) FK506 polyketide 
synthase 


0.95 


448 


U60275 


Capra hircus skeletal 
muscle voltage-gated 
chloride channel 
gClC-1 mRNA, 
partial cds 


0.54 


1781344 


0.95 


449 


U36795 


Myxococcus xanthus 
rfbABC O-antigen 
biosynthesis operon, 
rfbA, rfbB, and rfbC 
genes, complete cds. 


0.54 


3877232 


(Z81540) predicted using 
Genefinder 


0.74 


450 


AF053091 


Drosophila 
melanogaster eyelid 
(eld) mRNA, 
complete cds 


0.54 


2144110 


zinc finger protein RIZ - rat 
>gi|1949996 


0.14 


451 


V00602 


Genome of the 
bacteriophage fd 
(Inoviridae). 


0.54 


2661620 


(AL009197) hypothetical 
protein 


0.11 


452 


U60800 


Human semaphorin 
(CD100) mRNA, 
complete cds 


0.54 


125682 


KERATIN, ULTRA HIGH- 
SULFUR MATRIX PROTEIN 
(UHS KERATIN) 
>gi|109116|pir||A36686 ultra- 
high-sulfur keratin - sheep 
>gi|1306 (X55294) ultra high- 
sulphur keratin protein [Ovis 
aries] 


0.003 


453 


X85969 


S.coelicolor secD, 
secF & apt genes 


0.54 


3874972 


(Z99709) similar to Elongation 
factor Tu family (contains 
ATP/GTP binding P-loop); 
cDNA EST EMBL:D76223 
comes from this gene; cDNA 
EST yk478c5.5 comes from this 
gene [Caenorhabditis elegans] 


7e-06 


454 


Y08265 


H.sapiens mRNA for 
DAN26 protein, 
partial 


0.54 


3875131 


(Z70750) similar to vanadate 
resistance protein 
transmembranous domains 
[Caenorhabditis elegans] 


5e-12 






Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


455 


U89613 


Hydromantes 
platycephalus 
cytochrome b (cytb) 
gene, mitochondrial 
gene encoding 
mitochondrial 
protein, partial cds 


0.53 


<NONE> 


<NONE> 


<NONE> 


456 


AF034597 


Habrobracon hebetor 
cytochrome oxidase 
II gene, partial cds; 
and tRNA-Asp, tRNA- 
His, and tRNA-Lys 
genes, complete 
sequence, 

mitochondrial genes 
for mitochondrial 
products 


0.53 


<NONE> 


<NONE> 


<NONE> 


457 


K02653 


Yeast (S.cerevisiae) 
tau repetitive element 
and Cys-tRNA. 


0.53 


<NONE> 


<NONE> 


<NONE> 


458 


X53416 


Human mRNA for 
actin-binding protein 
(filamin) 


0.53 


2134839 


bullous pemphigoid antigen 2 - 
human 


6.2 


459 


M55545 


Drosophila 
subobscura alchohol 
dehydrogenase (Adh) 
gene, and alchohol 
dehydrogenase (Adh- 
dup) gene, complete 
cds's. 


0.53 


2136865 


hair keratin cysteine rich protein 
- sheep 


2.1 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















460 


U19362 


Methanobacterium 
thermoautotrophicum 
methylene- 
tetrahydromethanopte 
rin dehydrogenase 
(mtd), 

imidazoleglycerol- 
phosphate 
dehydrogenase 
(hisB), and putative 
ferredoxin (fdxA) 
genes, complete cds, 
orf9 gene, partial cds, 
orfs ... 


0.53 


731969 


HYPOTHETICAL 91.6 KD 
PROTEIN IN HXT8-CRT1 
INTERGENIC REGION 
>gi|1078261|pir||S50773 
probable membrane protein 
YJL212c - yeast 
(Saccharomyces cerevisiae) 
>gi|496950 (Z34098) ORF 
[Saccharomyces cerevisiae] 
>gi|1015596 (Z49487) ORF 
YJL212c 


0.54 


461 


AB011527 


Rattus norvegicus 
mRNA for MEGF1, 
complete cds 


0.53 


417037 


GERM CELL-LESS PROTEIN 
fruit fly (Drosophila 
melanogaster) >gi|157490 
(M97933) germ cell-less protein 
[Drosophila melanogaster] 


3e-06 


462 


U64313 


Bacillus firmus MsyB 
gene, 5' upstream 
region and partial cds 


0.52 


<NONE> 


<NONE> 


<NONE> 


463 


AF008590 


Caenorhabditis 
elegans paraquat 
responsive protein 
(CePqM132) mRNA, 
complete cds 


0.52 


<NONE> 


<NONE> 


<NONE> 


464 


L10245


Mus saxicola 
spermidine/spermine 
N1-acetyltransferase
(SSAT) gene, 
complete cds. 


0.52 


<NONE> 


<NONE> 


<NONE> 


465 


AF027173


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds


0.52 


124263


INSULIN-LIKE GROWTH
FACTOR IB PRECURSOR
(IGF-IB) (SOMATOMEDIN C)
>gi|69361|pir||IGHU1B insulin-
like growth factor IB precursor -
human prepropeptide [Homo
sapiens]


7.7 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Caenorhabditis 










466 


AL021066 


elegans cosmid 

1B20, complete 
sequence 
[Caenorhabditis 
elegans]


0.52 


2589162 


(D88451) aldehyde oxidase [Zea 
mays] 


6.0 


467


AF038588 


Porphyra linearis 18S 
ribosomal RNA gene, 
3' partial sequence 


0.52 


1055055 


flrtOS^fft coded for bv C 
elegans cDNA yk37gl.5; coded 
for by C. elegans cDNA 
yk5c9.5; coded for by C. 
elegans cDNA ykla9.5; 
alternatively spliced form of 
F52C9.8b 


4.6 


468 


AE001125 


Borrelia burgdorferi 
(section 11 of 70) of
the complete genome


0.52 


4115827 


(AB021287) polyprotein 
[Hepatitis G virus] 


2.0 


469


AF006640 


Drosophila 
melanogaster Ste20-
like protein kinase
mRNA, complete cds


0.52 


1109830 


(U41534) coded for by C. 
elegans cDNA CEESI42F; 
Similar to helicases of 
SNF2/RAD54 family.
[Caenorhabditis elegans] 


0.002 


470 


U90177 


Aplysia californica 
ubiquitin carboxyl- 
terminal hydrolase 
(Ap-uch) mRNA, 
complete cds 


0.51 


<NONE> 


<NONE> 


<NONE>


471 


Z28304 


S.cerevisiae 
chromosome XI 
reading frame ORF 
YKR079c 


0.51 


<NONE> 


<NONE> 


<NONE>


472 


Z92837 


Caenorhabditis 
elegans cosmid 
KUJfcl, complete 
sequence 
[Caenorhabditis 
elegans]


0.51 


123506 


HYDROPHOBIC SEED 
PROTEIN (HPS) 


7.6 


473 


D13803 


Mouse mRNA for 
RecA-like protein
MmRad51, complete
cds 


0.51 


3327228 


(AB014607) KIAA0707 protein
[Homo sapiens] 


4.5 


474 


X07187 


Pea hsp21 mRNA


0.51 


3328678 


(AE001299) hypothetical 
protein [Chlamydia trachomatis] 


4.4 






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


475


S63168


CCAAT/enhancer-
binding protein
delta=transcription
factor CRP3 homolog
[human, prostate
carcinoma cell line
LNCaP, Genomic,
1594 nt]


0.51 


1653215


(D90911) apolipoprotein N-
acyltransferase [Synechocystis
sp.]


1.2


476


U67078


Xenopus laevis C2-
HC type zinc finger
protein X-MyT1
mRNA, complete cds 


0.51 


3850320 


(AF067520) PITSLRE protein 
kinase beta SV2 isoform [Homo 
sapiens] 


0.17


477


L38933


Homo sapiens GT198 
mRNA, complete 
ORF 


0.51


3219965


HYPOTHETICAL 100.6 KD
TRP-ASP REPEATS 
CONTAINING PROTEIN 
C2C6.04C IN CHROMOSOME 
I 


0.059


478


AF001000


Lycopersicon
esculentum
polygalacturonase 1


0.50


<NONE> 


<NONE> 




479


Z28304


S.cerevisiae
chromosome XI
reading frame ORF
YKR079c


0.50 


<NONE>


<NONE>


<NONE>


480


X97225


Oncorhynchus keta
IGF-II gene


0.50 


<NONE> 


<NONE>

<NONE>


<NONE>


481


AJ001388


Homo Sapiens, RP58
cDNA for complete
mRNA


0.50


<NONE>


<NONE>


<NONE>






SEQ
ID



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION DESCRIPTION 



481

AJ001388

Homo Sapiens, RP58
cDNA for complete
mRNA

P.occultum 23S



482

M86626



483

U76523



484

AF031663



ribosomal RNA, 
partial cds. 



Sambucus nigra lectin 
precursor mRNA, 
complete cds 



P VALUE 



0.50 



0.50 



0.50 



485 



U32729 



486

AF067198



487

M23442



Mus musculus striatin
mRNA, complete cds

0.50

Haemophilus
influenzae Rd section
44 of 163 of the
complete genome

0.50



Dictyostelium
discoideum clone
9.10Tdd-3 and RED
repetitive elements,
partial sequence



Human interleukin 4
(IL-4) gene, complete
cds.



0.50 



0.49 



488

U16367



489

AF001000



490

Z18920



491

D86983



Caenorhabditis 
elegans POU 
homeobox protein 
CEH-18 (ceh-18) 
mRNA, complete cds.

0.47



Lycopersicon 
esculentum 
polygalacturonase 1 



Yersinia 
enterocolitica wbb 
gene cluster 



0.45 



Human mRNA for 
KIAA0230 gene, 
partial cds 



0.41 



0.35 



492

AF064030

Helianthus tuberosus
lectin 2 mRNA,
complete cds



0.33 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



P VALUE



<NONE> 
<NONE> 



<NONE> 
<NONE> 



1722856 



CHROMOSOME ASSEMBLY 
PROTEIN XCAP-E African 
clawed frog >gi|563814 
(U13674) XCAP-E [Xenopus 
laevis]



179521

(M63730) BPAG2 [Homo
sapiens]



3875699

(Z92829) F10A3.15
[Caenorhabditis elegans]



<NONE>
<NONE> 



3.2 



3.2 



2494740 



HYPOTHETICAL 28.3 KD

PROTEIN IN GBD 5'REGION
(ORF4) >gi|2120954|pir||I39562
ORF4 - Alcaligenes eutrophus
>gi|695274 (L36817) ORF4



0.65 



<NONE> 



<NONE> 



0.008 



3786409

(AF098499) contains similarity
to Saccharomyces cerevisiae
MAPI protein (GB:U19492)
[Caenorhabditis elegans]



<NONE> 



<NONE> 



<NONE> 



<NONE> 



206712 



(M64793) salivary proline-rich 
protein [Rattus norvegicus]



<NONE> 



<NONE>



<NONE> 



8.9 



<NONE> 



<NONE> 



4e-05 



<NONE> 






Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID

ACCESSION



496 



DESCRIPTION 



493

AF067083



494

Y15520



499 



Vitreoscilla sp. outer
membrane protein
homolog gene,
complete cds; Trp
repressor binding
protein gene, partial
cds; and unknown
genes

Papio hamadryas 
anubis gene encoding 
fertilin alpha-II 



P VALUE 



495

U33475



D88356 



Alestes sp. 
ependymin mRNA, 
partial cds 



Mouse DNA for 8- 
oxodGTPase, 

complete cds 

Methanococcus 



497

U67603



jannaschii section 145 
of 150 of the 
complete genome 



498

U82386



Malurus cyaneus 
microsatellite McyU2 



0.33 



0.29 



0.28 



0.22 



0.22 



Z49625 



500

U64830



501

M24543



S.cerevisiae 
chromosome X 
reading frame ORF 
YJR125c 



Dictyostelium
discoideum AX2 
protein tyrosine 
kinase gene, complete 
cds 

Human prostate- 
specific antigen (PA) 
gene, complete cds. 



0.22 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



401553 



2408049 



P VALUE



HYPOTHETICAL 24.5 KD 
PROTEIN IN NADB-SRMB 
INTERGENIC REGION 



3913078 



<NONE> 



(Z99164) hypothetical protein

ARYL HYDROCARBON
RECEPTOR NUCLEAR
TRANSLOCATOR
HOMOLOG (DARNT)
(TANGO PROTEIN)
transcription factor [Drosophila
melanogaster]



<NONE> 



8.3 



3.1 



1.4 



<NONE> 



2209261 



0.21 



0.21 



0.21 



992631 



<NONE> 



(U51222) p40 [Streptomyces 
halstedii] 



(U29131) Mg-chelatase subunit
[Synechocystis sp.]



<NONE> 



<NONE> 



2764859 



<NONE> 



(X97918) gene 12.1
[Bacteriophage SPP1]



8.3 



<NONE> 



<NONE> 



6.0 



Nearest Neighbor (BlastN vs. Genbank)

SEQ
ID

ACCESSION



503 



X87618 
X71591 



504

X57808



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



DESCRIPTION

P VALUE

ACCESSION



B.taurus mRNA for 
thrombospondin 
(partial) 2162 bp
B.taurus
microsatellite
sequence INRA048



DESCRIPTION 



ut)0o2b protein - 



P VALUE



0.21 



Human germline 
immunoglobulin
lambda light chain 
gene 



505

U95098



506 



U84216 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds



Mycobacterium 
fortuitum plasmid 
pJAZ38 replication 
protein Rep (rep) 
gene, complete cds 



0.21 



2146000 



1354453 



0.21 



2119158 



507

U31463



508

X51508



509

AF086476



510

AF077006

511

X75480



Rattus norvegicus 
nonmuscle myosin 
heavy chain-A 
mRNA, complete cds 



0.21 



2497139 



Mycobacterium tuberculosis 
tuberculosis]

>gi|1694863|gnl|PID|e283373
(Z83018) hypothetical protein
Rv2968c [Mycobacterium
tuberculosis]



(U52830) orf [Homo sapiens]



procollagen type V alpha 2 -
mouse >gi|309181



0.21 



2499087 



Rabbit mRNA for 
aminopeptidase N 
(partial) 



Homo sapiens full 
length insert cDNA 
clone ZD88FI2 
Helicobacter pylori 
plasmid pHPM186, 
complete sequence 



0.21 



3880111 



HYPOTHETICAL [illegible] KD
PROTEIN IN ABF2-CHL12
INTERGENIC REGION

>gi|1078003|pir||S52835
hypothetical protein YMR075w
yeast (Saccharomyces
cerevisiae) >gi|763022
(Z48952) unknown
[Saccharomyces cerevisiae]

GLUCOSE:GLYCOPROTEIN
GLUCOSYLTRANSFERASE
PRECURSOR (DUGT)
glucosyltransferase - fruit fly
(Drosophila sp.)
glucosyltransferase precursor
[Drosophila melanogaster]



3.5 



2.7 



2.7 



2.0 



0.21 



630864 



(Z81130) predicted using
Genefinder 



E.gunnii CAD gene. 



0.20 



0.20 



<NONE> 



0.20 



<NONE>
<NONE> 



LRR47 protein - fruit fly 
(Drosophila melanogaster)
>gi|415947 (X75760) LRR47
[Drosophila melanogaster] 



<NONE> 



<NONE> 



<NONE> 



0.003 



0.002 



1e-06



<NONE> 



<NONE> 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


521 


AF034460


Penicillium thomii
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene 
and internal 
transcribed spacer 2, 
complete sequence; 
and 28S ribosomal 
RNA gene, partial 
sequence 


0.20


114136 


AMINO-ACID 
ACETYLTRANSFERASE 
Pseudomonas aeruginosa 
>gi|151036 (M38358) N- 
acetylglutamate synthase 
[Pseudomonas aeruginosa] 


0.39






Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


0.20 


2842674


POU DOMAIN CLASS 2,
ASSOCIATING FACTOR 1 (B-
CELL-SPECIFIC
COACTIVATOR OBF-1) (OCT
BINDING FACTOR 1) (BOB-
1) (OCA-B) Bob1, B-cell-
specific - mouse
>gi|I8SlSlS|bbs|179852
mBob1=B-cell specific
transcriptional coactivator line
J558L, Peptide, 256 aa]
>gi|1353792 (U43788) Oct
binding factor 1 [Mus musculus]


0.073


523 


X95971 


S.lividans groEL2
gene 


0.20 


3925277 


(ALU3J64oj similar to 
Uncharacterized protein family 
UPF0034, Double-stranded 
RNA binding motif; cDNA EST 
yk4S9b3.5 comes from this 
gene; cDNA EST yk439g7.5 
comes from this gene 
[Caenorhabditis elegans]


4e-19


524 


L41502


Ovis aries
vasopressin V1
receptor (V1R) gene,
complete cds


0.19 


<NONE> 


<NONE> 


<NONE>


525


J03885


K.pneumoniae
oxalacetate
decarboxylase alpha
subunit gene,
complete cds.


0.19 


<NONE> 


<NONE> 


<NONE> 


526 


AE001451


Helicobacter pylori,
strain J99 section 12
of 132 of the
complete genome


0.19 


<NONE> 


<NONE> 


<NONE>






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


527 


D88084 


Pedicularis
verticillata
chloroplast DNA,
intergenic region
between trnT(UGU)
and trnL(UAA)5'exon


0.19


<NONE>


<NONE> 




528 


U67599 


Methanococcus 
jannaschii section 14 
of 150 of the 
complete genome 


0.19


<NONE> 


<NONE> 




529 


J05500 


Human beta-spectrin 
(SPTB) mRNA, 
complete cds. 


0.19 


<NONE> 


<NONE> 


<NONE>


530 


Y10137 


M.mycoides ftsY 
gene homologue and 
gene encoding 
hypothetical protein 


0.19 


<NONE>


<NONE>


<NONE>


531 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete
cds


0.19


<NONE> 


<NONE> 


<NONE>


532 


D43805 


Mouse thymic 
stromal cell mRNA 
for TLSF-beta,
complete cds 


0.19 


<NONE> 


<NONE> 


<NONE>


533


AJ012585


Tetrahymena
thermophila
macronuclear gene
encoding ribosomal
protein L3, exons 1-2


0.19


<NONE> 


<NONE> 


<NONE>


534


X51475


Brassica napus 5-
enolpyruvylshikimate-
3-phosphate synthase
gene


0.19


<NONE> 


<NONE> 


<NONE>


535


AF074386


Sambucus nigra
hevein-like protein
mRNA, complete cds


0.19 


<NONE> 


<NONE> 


<NONE>


536


Z49625


S.cerevisiae
chromosome X
reading frame ORF
YJR125c


0.19


<NONE> 


<NONE>


<NONE>



Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID

ACCESSION



DESCRIPTION 



537

X63741



538 



539 



Y11255 



L63537 



H.sapiens pilot 



mRNA 



O.latipes mRNA for 
annexin max4 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



0.19 



<NONE> 



DESCRIPTION 



<NONE> 



0.19 



<NONE> 



<NONE> 



Oncorhynchus mykiss 
(clone Jb-10) beta-2 
microglobulin (B2m) 
mRNA, complete cds



540

X70903

N.tabacum T92 gene
for auxin-binding
protein



0.19 



<NONE> 



<NONE> 



0.19 



<NONE> 



<NONE> 



P VALUE



<NONE> 



<NONE> 



<NONE> 



541 



543 



U61958 



Caenorhabditis 
elegans cosmid 

C25A8 

Macaca fascicularis 



542

U33959



fertilin beta mRNA, 
complete cds 



Z49835 



H.sapiens mRNA for 
protein disulfide 
isomerase 



0.19 



<NONE> 



<NONE> 



0.19 



<NONE> 



544

AF035458



Spinacia oleracea 
heat shock 70 protein 
protein, complete cds 



545

U23441



Tetrahymena
thermophila B
internal deletion
sequence.



546

U53921



Pneumocystis carinii 
major surface 
glycoprotein 



<NONE> 



0.19 



2113940 



(Z95356) hypothetical protein 
Rv2507 



0.19 



267293 



PROBABLE E4 PROTEIN
papillomavirus (type 1)
>gi|61015 (X62844) E4 gene
product [Pygmy chimpanzee
papilloma virus type 1]



0.19 



3877185

(Z66563) F46C3.2
[Caenorhabditis elegans]



0.19 



3548901 



(AF052502) DA26 homolog 
[Epiphyas postvittana 
nucleopolyhedrovirus] 



<NONE> 



<NONE> 



9.4 



9.4 



9.3 



547

L11002



Rat ankyrin binding 
glycoprotein- 1 related 
mRNA sequence. 



0.19 



3337352 



(AC004481) putative chromatin 
structural protein Supt5hp 



548 



549 



U67560 



Methanococcus 
jannaschii section 102 
of 150 of the 
complete genome 



0.19 



U18424



Mus musculus 
bacteria binding 
macrophage receptor 
MARCO mRNA, 
complete cds. 



0.19 



3183689 



(Y13585) serotonin receptor 4 
[Cavia porcellus] 



3659853 



(AF0S90S3) complement 
component C1qB like protein



9.1 



8.7 



7.1 






SEQ
ID



550 



551 



552 



553 



554 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



X66467 



AF003487 



J05087 



AF080464 



U78876 



555 



P VALUE 



Syngaster lepidus 16S 
ribosomal RNA gene, 
partial sequence 



Rat calmodulin- 
sensitive plasma 
membrane Ca2+- 
transporting ATPase 
(PMCA3) mRNA,
complete cds. 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



0.19 



Homo sapiens 
glutamate 
oxaloacetate 
transaminase 



Human MEK kinase 

mRNA, complete 
cds 



Vigna radiata mRNA
for proton
pyrophosphatase,
complete cds

AB009077



0.19 



0.19 



0.19 



DESCRIPTION 



3122039 



(U58751) C07G1.7 gene
product [Caenorhabditis
elegans]

DIHYDROPYRIMIDINASE
(DHPASE) dihydropyrimidinase
- rat

>gi|1378019|gnl|PID|d1010479



P VALUE



422462 



3024834 



556 



U95098 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds



557 



AE000392 



Escherichia coli K-12 
MG1655 section 282 
of 400 of the 
complete genome 



0.19 



0.19 



0.19 



1710445 



hypothetical protein - fruit fly 
(Drosophila melanogaster) 
>gi|296434 (X68408) ORF 
[Drosophila melanogaster] 



PROBABLE E4 PROTEIN 
>gi|790898 position 3286..3288 
is first start codon; putative



(U78083) unknown [Emericella
nidulans]



3256922 



4226159 



3645960 



(AP000002) 256aa long 
hypothetical protein 
[Pyrococcus horikoshii]



(AF125463) contains similarity
to BTB (also known as BR-
C/Ttk) domains (Pfam:PF00651
Score=62.8, E=7.6e-15, N=1)
[Caenorhabditis elegans]



(AL031583) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=47.46; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SWISS-
PROT:P23792; 2-
match_description=DISCONNECTED
PROTEIN.; 2-matc...



6.9 



5.3 



5.3 



5.3 



5.1 



4.1 



4.0 



Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID

ACCESSION



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



558

AE000392



559

L81774



Escherichia coli K-12
MG1655 section 282
of 400 of the
complete genome
Homo sapiens
(subclone 3_d1 from
P1 H25) DNA
sequence



560

AL021108



Drosophila 
melanogaster cosmid 
clone 137E7 



561

AB001510



Carabus
leptoplesioides
mitochondrial DNA 
for NADH 
dehydrogenase 
subunit 5, partial cds 



0.19 



0.19 



3645960 



4001725 



(AL031583) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=47.46; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SWISS-
PROT:P23792; 2-
match_description=DISCONNECTED
PROTEIN.; 2-matc...



(AB015981) MnhA
[Staphylococcus aureus]



4.0 



3.0 



0.19 



4001688 



562

AF069696

Egernia stokesii clone
EST1 microsatellite

0.19



0.19 



3758855 



563 



X64144 



564 



U56897 



565

U57975



F.pringlei ppcA1
gene for 
phosphoenolpyruvate 

carboxylase 

Human 

immunodeficiency 
virus type 1 gag 
polyprotein (gag) 
gene, partial cds 



Danio rerio Notch 
homologue 3 mRNA,
complete cds 



3328994 



0.19 



3242974 



0.19 



2257710 



0.19 



3874971 



(AB015718) protein kinase 
[Homo sapiens] 



(Z98551) MAL3P6.11
[Plasmodium falciparum]



(AE001326) Amino Acid
(Branched) Transport
[Chlamydia trachomatis]



(AF069555) G protein-coupled
receptor p2y3 [Meleagris
gallopavo]



3.0 



2.4 



2.4 



(U73041) resolvase-like protein
[Thiobacillus ferrooxidans]



K^IW) similar to NAD 
dependant 

epimerase/dehydratase family; 
cDNA EST EMBL:C10103 

comes from this gene; cDNA
EST EMBL:D66400 comes
from this gene; cDNA EST
EMBL:D70143 comes from this
gene; cDNA EST yk493h11.3
comes from ... 



2.3 



1.8 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION

DESCRIPTION



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



566

Y12502



567

S82470



568 



U97408 



R.norvegicus mRNA 
for factor XIIIa



BB1=malignant cell
expression-enhanced
gene/tumor
progression-enhanced
gene [human, UM-
UC-9 bladder
carcinoma cell line,
mRNA, 1897 nt]



Caenorhabditis 
elegans cosmid 
F48A9 



569

U10470



570

M88160



Pseudomonas
fluorescens PHA
depolymerase (phaZ)
gene, complete cds.



Ovis aries MAF214
locus polymorphic
dinucleotide repeat.

Holcus lanatus



0.19 



0.19 



0.19 



DESCRIPTION 



2133693 



2444026 



542433 



masquerade precursor - fruit fly



P VALUE



(Drosophila melanogaster)
>gi|665545 (U18130)
masquerade [Drosophila
melanogaster]

>gi|1095942|prf||2110286A
masquerade gene



(U77783) N-methyl-D-aspartate
receptor 2D subunit precursor
[Homo sapiens]



25K protein - Babesia bovis 
(fragment) 



1.8 



1.8 



0.19 



571

AJ131336



572 



X84036 



mRNA for pollen
allergen (Hol l 2,
group II) >
emb|AJ131339|LIT131339
Lolium italicum
mRNA for pollen
allergen (Lol i 2,
group II) > allergen
(Poa p 2, group II) >

emb|AJ131338|TAE131338
Triticum
aestivum mRNA for
pollen allergen (Tri a
2, group II)



0.19 



S.cerevisiae ARG8 
and CDC33 genes



0.19 



0.19 



3721862 



1293816 



(AB016024) Pf12 [Plasmodium
falciparum]



3880447



(U56963) T13A10.5 gene
product [Caenorhabditis
elegans]



1.8 



1.7 



1.4 



(AL032675) predicted using 
Genefinder 



3882041

(AJ010405) hypothetical protein

0.62



0.82 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human WD protein 






mucin - human >gi|501033




573 


U57058 


ER10 pre-mRNA, 
partial cds 


0.19 


631302 


(U14383) mucin [Homo 
sapiens] 


0.60 


574 


AF034460 


Penicillium thomii 
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene 
and internal 
transcribed spacer 2, 
complete sequence; 
and 28S ribosomal 
RNA gene, partial 
sequence 


0.19


114136 


AMINO-ACID 
ACETYLTRANSFERASE 
Pseudomonas aeruginosa 

>gi|151036 (M38358) N-
acetylglutamate synthase
[Pseudomonas aeruginosa] 


0.35 


575 


U95098


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


0.19 


105270 


alpha-2-adrenergic receptor -
human name ADRA2R [Homo
sapiens]


0.27 


576 


AG001475 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
125H6N2 


0.19 


94977 


hypothetical protein 3 -

Pseudomonas sp. (DSM 6898) 
plasmid pKB740 >gi|45867 
(X66604) ORF3 


0.16 


577 


M63284 


Mouse IgG receptor 
(beta-Fc-gamma-RII) 
gene, exons 9 and 10, 
clones lambda- 
Fc(3.2,93). 


0.19 


3024681 


TRANSCRIPTION 
INITIATION FACTOR TFIID
135 KD SUBUNIT (TAFII-135)
(TAFII135) (TAFII-130) of
RNA polymerase II transcription 
factor TFIID [Homo sapiens] 


0.088 


578 


U38241 


Pseudomonas 
aeruginosa orotate 
phophoribosyl 
transferase (pyrE), 
catabolite repression 
control protein (crc) 
and RNasePH (rph) 
genes, complete cds 


0.19 


3044086 


(AF055904) unknown 
[Myxococcus xanthus]


0.052 


579 


AF039734 


Lontra longicaudis 
transthyretin intron i, 
partial sequence 


0.19 


322759 


pistil extensin-like protein 
(clone pMG14) - common 
tobacco (fragment) >gi|19927
(Z14015) pistil extensin like
protein [Nicotiana tabacum]


0.030


580 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 


0.19 


2147194


collagen - Paralvinella grasslei


0.002 
Nearest Neighbor (BlastN vs. Genbank)

SEQ
ID

ACCESSION



581

AB004232



582

AF098919



583

AE001457



584

L10329



585

AE001155



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



Drosophila 
melanogaster mRNA
for DAD polypeptide,
complete cds



Gallus gallus alpha-
globin gene domain 5 



DESCRIPTION 



0.19 



2498765

PEROXISOMAL MEMBRANE
PROTEIN PEX16 lipolytica]



region 



Helicobacter pylori, 
strain J99 section 18 
of 132 of the 
complete genome 



Plasmid RP4 traE
gene, 3' end; traD 
gene, complete cds; 
traF gene, 5' end. 



0.19 



1086863 



(U41272) T03G11.6 gene
product [Caenorhabditis
elegans]



0.19 



2924552 



(AL022018) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=165.48; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=AA264666; 2-
match_description=LD0835 5prime
LP Drosophila melanoga..



0.19 



3878117 



586 



U49979 



587 



U88155 



588 I AF061854 



589 



M23865 



Borrelia burgdorferi 
(section 41 of 70) of
the complete genome
Orf virus E10R
homolog gene, partial 
cds, and DNA 
polymerase gene, 
complete cds 



(Z49068) mitochondrial carrier 
protein 



0.19 



861276 



Xenopus laevis 
RanGTPase 
activating protein 



Schizosaccharomyces 
pombe Clr4p (clr4) 
gene, complete cds 



(U28739) similar to TPR 
domains in e.g. yeast STI1
protein [Caenorhabditis elegans]



P VALUE



0.002 



4e-05 



3e-05 



8e-07 



0.19 



3850072 



(AL033385) dna-directed rna
polymerase iii subunit 
[Schizosaccharomyces pombe] 



0.19 



995714 



(X91258) pid:e198503
[Saccharomyces cerevisiae] 



S.cerevisiae CHS2 
gene encoding chitin 
synthase. 



0.19 



3242750 



(AC005 164) match to ESTs 
AA73I149 (NID:g2140i38), 
AA731908 (NID:g2752719), 
AA2S7S37 (NED:gl9335 19), 
AA262S11 (NID:glS98382), 
and AAS25820 (NIP: g2 899 132) 



0.18



<NONE> 



<NONE> 



2e-12 



1e-15



4e-16 



5e-19 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)
SEQ
ID

ACCESSION



DESCRIPTION 



590 | U95094 



591

AF067610



592

AF036329



Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA 
complete cds 
Caenorhabditis 
elegans cosmid 
F41A4 



P VALUE 



Homo sapiens
gonadotropin- 
releasing hormone 
precursor, second 
form (GnRH-II) gene 
complete cds 



593 I Z49216 



H.sapiens
mitoxantrone- 
resistance associated 
mRNA 



594 1 X02167 



595 1 Z31561 



Torulopsis glabrata 
mitochondrial DNA 
for tRNA-Thr, -His
and -Glu upstream of
cytochrome b gene



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION



0.18 



0.18 



0.18 



0.18 



R. communis 
(Carmencita) Scri 
mRNA for sucrose 
carrier 



0.18 



596 1 L81692 



Homo sapiens
(subclone 2_c9 from
P1 H56) DNA
sequence 



597 I X57310 



598 | U18315 



Nocardia 

lactamdurans pcbAB 
and pcbC genes for 
alpha-aminoadipyl-L-
cysteinyl-D-valine
synthetase and
isopenicillin N

synthase

Sus scrofa



0.18 



0.18 



parathyroid receptor 
(PTH) mRNA, 
complete cds 



0.18



DESCRIPTION 



<NONE>
<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



1346575 



126404 



<NONE> 



P VALUE 



<NONE> 



<NONE> 



<NONE> 



<NONE>



<NONE> 



<NONE> 



55 KD ERYTHROCYTE 
MEMBRANE PROTEIN 



SEED LIPOXYGENASE-2 (L- 
2) soybean >gi|170014 (J03211)
lipoxygenase (EC 1.13.11.12) 



8.4 



6.5 



0.18 



1022323 



(X04647) collagen alpha-2(IV) 
chain [Mus musculus] 



3.8 






SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)

ACCESSION

DESCRIPTION

P VALUE



599 | AL010158 



600

AB005287



601 | AL021108 



602 | U57975 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



Plasmodium 
falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
from contig 3-85, 
complete sequence 



Bos taurus mRNA for 
thrombospondin I, 
complete cds 



Drosophila 
melanogaster cosmid 
clone 137E7 



0.18 



2506816 



0.18 



2146000 



0.18 



Danio rerio Notch 
homologue 3 mRNA,
complete cds 



3483032 



603 



M30124 



604 1 X54965 



605 



U95098 



P.aeruginosa 
autonomously 
replicating sequence.



G.sp alpha 5HR DNA 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds



0.18 



85719 



0.18 



3878017 



0.18 



134304 



606 



U20793 



Oryctolagus 
cuniculus renal 
sodium-dependent 
phosphate transporter 
type II mRNA, 
complete cds. 



0.18 



1628403 



DESCRIPTION 



P VALUE



0.18



PRECURSOR
PROTEOGLYCAN CORE
PROTEIN 2) (GLIAL
HYALURONATE-BINDING
PROTEIN) (GHAP) >gi|608515
(U16306) chondroitin sulfate
proteoglycan versican V0 splice-
variant precursor peptide
protein

Mycobacterium tuberculosis
tuberculosis]

>gi|1694863|gnl|PID|e283373
(Z83018) hypothetical protein
Rv2968c [Mycobacterium
tuberculosis]



(AL031371) hypothetical 
protein SC4G2.06 
[Streptomyces coelicolor]



collagen alpha l'(U) chain 
precursor - African clawed frog
( ALui 1 J87) similar to Zinc



finger, C4 type (two domains); 
cDNA EST yk452f4.5 comes
from this gene; cDNA EST 
EMBL:T00774 comes from this 
gene receptor NHR-3 
[Caenorhabditis elegans] 



STEM CELL PROTEIN 
chicken >gi|62845 (X63371) 
transforming capacity [Gallus
gallus]



(X98893) hTAFII68 [Homo
sapiens] splicing [Homo 
sapiens] 



1705984 



92 KD TYPE IV 
COLLAGENASE 
PRECURSOR IV, 92K, 
precursor - rat >gi|1022784
(U36476) 92-kDa type IV
collagenase [Rattus norvegicus] 



3.7 



2.9 



2.9 



1.7 



1.3 



1.3 



1.3 



1.2 






Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID

ACCESSION

DESCRIPTION



608

U49953



609

J00182



610 



611 



X62513 
X04862 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



Human 
cholecystokinin type 
A receptor (CCK-A) 
gene, exons 1 and 2.

Rattus norvegicus 
protein kinase MUK2 
mRNA, complete cds 



Human alpha globin
gene cluster on 
chromosome 16: zeta 
gene. 



612

M12450



613

AF038539



M.gallopavo gene for 

metallothionein 

Goat embryonic alpha 

globin gene zeta
exons 2-3 
Rat vitamin D 
binding protein 
(DBP) mRNA, 
complete cds. 



ACCESSION 



0.18 



0.18 



0.18 



Mus musculus muscle 
NSP-like 1 (Nspll) 
mRNA, complete cds 



0.18



0.18 



0.18 



614

X78401



615



D38754



Bacteriophage P22
right operon, orf 48,
replication genes 18 
and 12, nin region 
genes, ninG 
phosphatase, late 
control gene 23, orf 
60, complete cds, late 
control region, start 
of lysis gene 13 



0.18 



Pig mRNA for inter- 
alpha-trypsin 
inhibitor heavy-chain 
H1, complete cds



0.18 



DESCRIPTION 



3261734 



551238 



(Z94752) hypothetical protein 
Rv1004c



(X81847) pectate lyase 1 
[Erwinia carotovora] 



J585259

traJ gene [Amycolatopsis
methanolica]



2494740

HYPOTHETICAL 28.3 KD
PROTEIN IN GBD 5'REGION
(ORF4) >gi|2120954|pir||I39562
ORF4 - Alcaligenes eutrophus
>gi|695274 (L36817) ORF4



86837

androgen receptor B - human



4210432

(AJ130783) APC2 protein [Mus
musculus]



P VALUE



0.97 



0.43 



0.41 



3297877 



(AJ224868) GNAS1 [Homo
sapiens]



1123087 



0.31 



0.08: 



0.038 



0.029 



(U42436) C49H3.3 gene 
product [Caenorhabditis 
elegans] 



0.18



1397275 



(U61947) C06G3.8 gene 
product [Caenorhabditis 
elegans] 



0.009 



7e-06 






Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID

ACCESSION



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



LRR47 protein - fruit fly



P VALUE



616 



X51508 



Rabbit mRNA for 
aminopeptidase N 
(partial) 



0.18 



630864 



617 



X54850 



S.kluyveri linear 
plasmid pSKL DNA 
for open reading 
frames 1-10 



0.18 



3183405 



618 



L21954 



619

U09355



620

X58715



621

AF060195



622 



L27235 



Human peripheral 
benzodiazepine 
receptor gene, exon 4 
Oryctolagus 
cuniculus protein 
phosphatase 2A1 B 
gamma subunit 
(skeletal muscle 
isolate) mRNA, 
complete cds. 



T.cruzi hsp70 mRNA 
for 70 kDa heat shock 
protein, partial cds 



0.18 



0.18 



Mus musculus 
proteasome regulator 
PA28 beta subunit 
gene, complete cds 



Methylobacterium 
extorquens serine 
cycle proteins 



0.18 



0.18 



0.18



3925211 



3947877 



3024081 



(Drosophila melanogaster)
>gi|415947 (X75760) LRR47
[Drosophila melanogaster]



HYPOTHETICAL [illegible] KD
PROTEIN C2C6.07 IN
CHROMOSOME I
>gi|2370504|gnl|PID|e339194
pombe]

>gi|3451305|gnl|PID|e1316730
(AL031324) very hypothetical
protein [Schizosaccharomyces
pombe]



6e-07 



2e-08 



( ALUj>2b2bj cDN A EST" 
EMBL:D70654 comes from this
gene; cDNA EST
EMBL:Z14359 comes from this
gene; cDNA EST
EMBL:D33409 comes from this
gene; cDNA EST
EMBL:D36239 comes from this
gene; cDNA EST
EMBL:Z14766 comes from this
gene... | 4e-09



(AL034382) putative mitosis 
and maintenance of ploidy

protein [Schizosaccharomyces
pombe]



MYOSIN LIGHT CHAIN
KINASE, SMOOTH MUSCLE
AND NON-MUSCLE
ISOZYMES (MLCK)
(CONTAINS: TELOKIN) 



8e-11



9e-12 



861276 



(U28739) similar to TPR 
domains in e.g. yeast STI1 
protein [Caenorhabditis elegans] 



1e-14



(AF027208) AC133 antigen
2688949 | [Homo sapiens]



1e-14






Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION | DESCRIPTION



623 | AF006573



624 | AF001782



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE



Drosophila virilis
maltase 1 (Mav1) and
maltase 2 (Mav2)
genes, complete cds



Staphylococcus 
aureus strain SA502A 

AgrB 

Homo sapiens germ 



626 | J03059



line DNA upstream of 
Jkappa locus 
Human 

glucocerebrosidase 
(GCB) gene, 
complete cds 



0.13 



2500558 



PUTATIVE RIBONUCLEASE
III (RNASE III)

>gi|3876420|gnl|PID|e1346063
(Z81070) similar to ribonuclease
[Caenorhabditis elegans]



0.17 



<NONE> 



<NONE> 



0.17 



0.17 



627 | AB008860



628 | AF027174 



629 | AF059650 



Fugu rubripes Cal2 
gene for pheromone 
receptor, complete 
cds 

Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 



<NONE> 



<NONE^ 



<NONE> 



<NONE> 



0.17 



2198849 



" lAWJU4yiX)j h JkARP [Homo ' 
sapiens] >gi|2665826 
(AF035771) Na+/H+- exchanger 
regulatory factor 2 [Homo 
sapiens] factor 2 [Homo 
sapiens] 

>gi|3618353|gnl|PED|dl0341S2 
[exchanger isoform A3 [Homo 



2e-23 



<NONE> 



<NONE> 



<NONE> 



Homo sapiens histone 
deacetylase 3 
(HDAC3) gene, 
complete cds 



0.17 



539355 



SCD25 protein (version 1)
yeast



0.17 



482118 



hypothetical protein C15H7.1 - 



Caenorhabditis elegans



7.8 



7.5 



4.5 






SEQ 



Nearest Neighbor (BlastN vs. Genbank)



ID | ACCESSION | DESCRIPTION



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



630 | AF059650



631 



X55065 



632 | U15280



Homo sapiens histone 
deacetylase 3
(HDAC3) gene, 
complete cds 



Chinese hamster 
metallothionein II 
gene 



0.17 



465932 



PROTEIN F58A4.11 IN 
CHROMOSOME III
>gi|3874287|gnl|PID|e1344088
EST EMBL:C12577 comes
from this gene; cDNA EST
yk227e7.5 comes from this
gene; cDNA EST yk303d1.5
comes from this gene; cDNA
EST yk314c12.5 comes from
this gene; cDNA ...
EMBL:C11886 comes from this
gene; cDNA EST
EMBL:C12577 comes from this
gene; cDNA EST yk227e7.5
comes from this gene; cDNA
EST yk303d1.5 comes from this
gene; cDNA EST yk314c12.5
comes from this gene; cDNA ...



Rattus norvegicus 
oxytocin receptor 
(OTR) gene, exon 3
and complete cds



633 | X04862



634 | AL010222



635 



X60111 



636 



U49979 



Goat embryonic alpha 
globin gene zeta 
exons 2-3 



Plasmodium 
falciparum DNA ***
SEQUENCING IN 
PROGRESS *** 
from contig 4-09, 
complete sequence 

H.sapiens mRNA for 
MRP-1



0.17 



3687237 



0.17 



542565 



P VALUE



(AC005169) putative Cys3His 
zinc-finger protein



0.17 



86837 



cyclin E type II - fruit fly
(Drosophila melanogaster)

>gi|429168 (X75027)
Drosophila cyclin E type II
[Drosophila melanogaster]



4.4 



1.5 



androgen receptor B - human 



Orf virus E10R
homolog gene, partial 
cds, and DNA 
polymerase gene, 
complete cds 



0.17 



0.17 



1177322 



3237306 



(X95466) CPG2 protein [Rattus 
norvegicus] 

>gi|l588593|prf||2208498A 
plasticity-related gene [Rattus 
norvegicus] 

(U92715) breast cancer 
antiestrogen resistance 3 protein 



0.17 



3850072 



0.45 



0.080 



(AL033385) dna-directed rna
polymerase iii subunit
[Schizosaccharomyces pombe]



7e-07 



7e-15 








Nearest Neighbor (BlastN vs. Genbank)

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
















637 


AF006573 


Drosophila virilis 
maltase 1 (Mav1) and
maltase 2 (Mav2) 
genes, complete cds 


0.17 


2500558 


PUTATIVE RIBONUCLEASE
III (RNASE III)
>gi|3876420|gnl|PID|e1346063
(Z81070) similar to ribonuclease
[Caenorhabditis elegans] 


2e-29 


638 


AE001141 


Borrelia burgdorferi 
(section 27 of 70) of 
the complete genome 


0.16 


1850327 


(U52370) fertilin beta [Homo 
sapiens] 


2.3 


639 


M72980 


Anthonomus grandis 
vitellogenin gene 
(VTG), complete cds. 


0.12 


3242750 


(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),

and AA825820 (NID:g2899132)


ze-_)o 


640 


AF023532 


Simulium vittatum 
ATPase 6 gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


0.11 


<NONE> 


<NONE> 


<NONE> 


641 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


0.10 


3482965 


(AL031369) putative protein


0.49 


642 


AJ001596


Danio rerio mRNA 
for opioid receptor 
homologue


0.099 


1706694 


LANOSTEROL SYNTHASE 
(Schizosaccharomyces pombe) 


2.3 


643 


U26341 


Oryctolagus 
cuniculus Na and Cl
dependent betaine 
transporter mRNA, 
complete cds. 


0.099 


2645804 


(AF033381) betaine 
homocysteine methyltransferase
[Mus musculus] 


0.59 


644 


M11633


Bacteriophage Cp-5
(S. pneumoniae) 3'
inverted terminal
repeat.


0.082 


2314695


(AE000649) type IIS restriction
enzyme R and M protein


4.3 


645 


X74103


Streptomyces sp.
gene for alkaline
serine protease I


0.073 


1314734


(U54641) 220 kDa silk protein
[Chironomus thummi]


6.3 






Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION | DESCRIPTION | P VALUE
Caenorhabditis 



646 | Z72509



elegans cosmid 
F32G8, complete 
sequence 
[Caenorhabditis 
elegans] 



648 | Z69906



649 | AF056940

650 | AJ001151



651 | X54455



652 | X87936



653 | AF019236



X.laevis xanf-1 gene

Human DNA
sequence from
cosmid E141E2, on
chromosome 22,
complete sequence
[Homo sapiens]



0.072 



0.070 



0.069 



Drosophila virilis 
retrotransposon Tvl, 
complete sequence 
Homo sapiens 
genomic sequence 
Bacteriophage BF23 
gene 17 and gene 18 



P.pinea internal 
transcribed spacers 1
& 2 of ribosomal
DNA 



0.069 
0.068 
0.067 



Dictyostelium
discoideum TipD
(tipD) gene, complete
cds



654 | X90592



655 | U41805



656 | AB007881



657 | AL010213



O.cuniculus mRNA 
for p53 protein



Mus musculus
putative T1/ST2
receptor binding
protein precursor
mRNA, partial cds



Homo sapiens 
KIAA0421 mRNA, 
partial cds 

Plasmodium
falciparum DNA ***
SEQUENCING IN
PROGRESS ***
from contig 3-109,
complete sequence



0.067 



0.067 



0.067 



0.067 



0.066 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



DESCRIPTION 



<NONE> 



<NONE> 



3851202 



<NONE> 



(AC005954) ZO-3 [Homo 
sapiens] [Homo sapiens]



<NONE> 



0.40 



<NONE> 



2246532 
<NOKE> 
<NONE> 



(U93872) ORF 73, contains
large complex repeat CR 73 



<NONE> 
<NONE> 



2459733 



(U95374) aldehyde 
dehydrogenase [Haloferax 
volcanii]



<NONE>[ 



5e-12 
<NONE>[ 
<NONE> 



3882275 



(AB018320) KIAA0777 protein
[Homo sapiens] 



METHIONINE
AMINOPEPTIDASE 2
(METAP 2) GLYCOPROTEIN)
1703275 | (P67)



(U17326) neuronal nitric oxide
642518 | synthase [Homo sapiens]



<NONE> 



4.3 



1.1 



0.29 



0.29 



0.066 



<NONE> 



<NONE> 



<NONE> 1 



<NONE> 1 






Nearest Neighbor (BlastN vs. Genbank)
SEQ 

DESCRIPTION 



ID | ACCESSION



658 | AB014546



659 | AF104156



660 



661 



X97581 



D85378 



P VALUE 



Homo sapiens mRNA 
for KIAA0646 
protein, complete cds

Rattus exulans isolate 
huahine30 
mitochondrial D- 
loop, partial sequence 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



0.066 



0.066 



662 | M97561



663 | AE001373



M.musculus mRNA 
for spalt transcription 

factor | 0.066 

Human clone H20 N- 
acetylglucosaminyltra 
nsferase III DNA, 
exon 2 0.066 



Human (clone 
LA 179) chromosome 
21 sequence. 



1082461 



1002380 



DESCRIPTION 



P VALUE



4107313 



2114473 



homeotic protein HB9 - human | 0.38



(U24189) RRM-type RNA 
binding protein [Caenorhabditis 
elegans] | 0.29 



Plasmodium 
falciparum 
chromosome 2,
section 10 of 73 of 
the complete 
sequence 



0.065 



(AL035075) putative myosin 
heavy chain 



(U96963) p140mDia [Mus
musculus]



0.28 



0.22 



<NONE> 



665 | AF032922



growth hormone 
receptor, growth 
hormone binding 
protein {GHR/BP 
gene} [mice, C57 
black/6, Genomic, 
179 nt, segment 8 of 
10] 

Homo sapiens 
syntaxin 4 binding 
protein UNC-18c
(UNC-18c) mRNA,
complete cds 



0.065 



666 I SS0986 



<NONE> 



<NONE> 



svp[40]=svp-related
nuclear 

receptor/retinoid 
signaling modulator 
zebrafishes. mRNA 
3S76 ntl 



0.065 



0.065 



<NONE> 



3061308 



0.065 



13262SS 



<NONE> 



I <NONE> 



<NONE> <NONE> 



(AB006074) topoisomerase III
[Mus musculus] | 0.82



(U5S734) weak similarity to 
ankyrin G [Caenorhabditis 
elegans]



0.12 






Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION



DESCRIPTION 



667 | X59552



668 | M72980



669 | AB014546



P VALUE 



G.domesticus mRNA 
for ventricular myosin 
heavy chain 



Anthonomus grandis 
vitellogenin gene 
(VTG), complete cds. 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION



DESCRIPTION
HiPUlHhllLAT 



0.065 



2497098 



670 | M30039



671 



Z68013 



672 | AF041332



673 



J00451 



Homo sapiens mRNA 
for KIAA0646 
protein, complete cds 



0.065 



3242750 



INTERGENIC REGION
>gi|1077180|pir||S49745
probable membrane protein
YML034w - yeast
(Saccharomyces cerevisiae)
>gi|575685 (Z46659) unknown
orf, len: 656, CAI: 0.13
[Saccharomyces cerevisiae]

(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),
and AA825820 (NID:g2899132)



P VALUE



0.014 



Sheeppox virus strain 
KS-1 ORF HM1
gene, partial cds;
ORF HM2 and ORF
HM3 genes, complete
cds; and ORF HM4
gene, partial cds
Caenorhabditis
elegans cosmid
W02H3, complete
sequence
[Caenorhabditis
elegans]

Bodo saltans 
unknown mRNA, 

kinetoplast gene
encoding kinetoplast
protein, complete cds



0.064 



<NONE> 



<NONE> 



0.064 



<NONE> 



0.064 



<NONE> 



Mouse germline IgG 
chain gene, D-J-C 
region, and switch

region. 



0.064 



<NONE> 



<NONE> 



5e-3: 



<NONE> 



<NONE> 



0.064 



<NONE> 



<NONE> 



<NONE> 






SEQ 
ID 



674 



675 



676 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



U41289 



DESCRIPTION 



P VALUE 



M37395 



Z15030



677 



678 



Z12021 



L05668 



Dictyostelium
discoideum K7
kinesin-like protein
mRNA, complete cds
L.lactis (strain SK11)
proteinase plasmid
pSK111 DNA,
complete cds. 



H.sapiens gene for 
ventricular myosin 
light chain 2 > :: 
gb|L01652|HUMVM 
LC Human 
ventricular myosin 
light chain 2 gene, 
seven exons. 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



0.064 



0.064 



3482972 



993019 



0.064 



730343 



G.max gene for 
catalase 



Entamoeba histolytica
protein 

serine/threonine 
kinase (pstkl) gene, 
complete cds. 



0.064 



2498711 



0.064 



733140 



(AL031369) putative protein



(X87246) alternative start codon 

[Pseudorabies virus]
PROLACTIN RECEPTOR



PRECURSOR (PRL-R) mouse
>gi|220576|gnl|PID|d1001535
(D10214) prolactin receptor
precursor [Mus musculus]
>gi|293770 (L14811) prolactin
receptor [Mus musculus]
>gi|347842 (L13593) prolactin
receptor [Mus musculus]
receptor:ISOTYPE=long form
[Mus musculus] 

ORIGIN RECOGNITION 
COMPLEX PROTEIN, 
SUBUNIT 2 >gi|1185461
(U3S472) essential ORC2-
related fission replication factor 
Orp2 [Schizosaccharomyces 
pombe] 



P VALUE



(U22453) carboxypeptidase 
[Simulium vittatum] 



9.3 



9.2 



9.1 



5.3 



5.3 






Nearest Neighbor (BlastN vs. Genbank)

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE


679 


U50715 


Mus musculus alpha- 
galactosidase A gene, 
complete cds 


0.064 


125398


HYGROMYCIN-B KINASE
(HYGROMYCIN B 

PHOSPHOTRANSFERASE) 

(APH(7"))

>gi|66885|pir||WGSMHH
hygromycin B

phosphotransferase (EC 2.7.1.-)
Streptomyces hygroscopicus
>gi|581682 (X03615) pot. hyg
protein [Streptomyces
hygroscopicus]
phosphotransferase [synthetic 
construct] >gi|2739064 cloning 
vector] >gi|2739068 
^.-vrwzj i+i ) nygromycm a 
phosphotransferase [unidentified 
cloning vector]


2.3


680 


Z28182 


S.cerevisiae 
chromosome XI 
reading frame ORF 
YKL182w 


0.064 


1079035 


Om(2D) protein - fruit fly
(Drosophila ananassae)
>gi|443770|gnl|PID|d1006095
(D26553) ORF 


1.8


681


M29917 


Human ornithine 
aminotransferase 
gene, exon i. 


0.064 


2317934 


(U97553) unknown [murine 
herpesvirus 68] 


1.4


682


AB020709 


Homo sapiens mRNA 
for KIAA0902 
protein, complete cds 


0.064 


861404 


(U29154) T07F12.3 gene 
product [Caenorhabditis 
elegans]


0.47


683


AB014546


Homo sapiens mRNA 
for KIAA0646 
protein, complete cds 


0.064


1708118


HOMEOBOX PROTEIN HB9
>gi|507425


0.35


684


AB010427


Homo sapiens mRNA
for NORM, complete
cds


0.064 


2388676


(AF015539) precollagen P
[Mytilus edulis]


0.018 


685


U34774


Orf virus ankyrin-like
repeat protein, F11L
homolog, and F12L
homolog genes,
complete cds.


0.064


731668


SSF1 PROTEIN
>gi|626624|pir||S46700 SSF1
protein - yeast (Saccharomyces
cerevisiae)


1e-05


686


AF022861


Mus musculus
neuropilin-2(a5)
mRNA, alternatively
spliced, complete cds


0.064


4091978


(AF073359) benzaldehyde
dehydrogenase [Pseudomonas
sp. DJ77]


1e-05






SEQ 
ID 



687 



688 



Nearest Neighbor (BlastN vs.



ACCESSION 



U14331



AF074870 



689 



Z25523 



DESCRIPTION 



Genbank)
P VALUE 



Sus scrofa myogenin 
gene, complete cds

Chironomus 
pallidivittatus clone 
1219 non-telomeric 
Ssp repeat sequence 



690 | AEQQ137S 



691 



Z72947 



692 



Y14723



H.sapiens repeat 
region DNA
Plasmodium
falciparum
chromosome 2,
section 15 of 73 of
the complete
sequence



S.cerevisiae 
chromosome VII 
reading frame ORF 
YGR162w 



Choanomphalus 
incertus 
mitochondrial 
cytochrome c oxidase 
subunit I gene, partial 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



0.064 



0.063 



2781386 



<NONE> 



0.063 



<NONE> 



(AC004010) similar to Leucine-
rich transmembrane proteins
44% similarity to U42767
(PID:g1736918) [Homo
sapiens] 



<NONE> 



P VALUE



<NONE> 



0.063 



<NONE> 



0.063 



<NONE^ 



693 | X74103



694 | AF039843 



Streptomyces sp. 
gene for alkaline 
serine protease I 



0.063 



<NONE> 



<NONE> 



<NONE> 



0.063 



1730713 



Homo sapiens 
Sprouty 2 (SPRY2) 
mRNA, complete cds



0.063 



232217 



< NONE> 
H t PU 1 Mb 1 1L AL lU8.i KJJ 

PROTEIN IN UME3-PUB 1 

INTERGENIC REGION

>gi|2131866|pir||S62935 

hypothetical protein YNL023c 

yeast (Saccharomyces 

cerevisiae) 

>gi|1301855|gnl|PID|e239870
(Z71299) ORF YNL023c

[Saccharomyces cerevisiae]
GLUTATHIONE S-

TRANSFERASE GST-6.0 
(GST B 1-1) 

>gi|42119S|pir||S29772 
glutathione transferase (EC 
2.5.1.18) - Proteus mirabilis 
>gi|2l26142|pir(|S718S2 
glutathione transferase (EC 
'2.5.1.18) B - Proteus mirabilis 
>gi| 1053076 (U38482) 



14 (p 



3e-33 



<NONE> 



<NONE> 



<NONE> 



<NONE> I 



6.7 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Mouse M-twist gene 
695 | M63650 | mRNA, complete cds.



696 



697 



Y13298 



X56600 



698 



Homo sapiens GDP 
dissociation inhibitor 
beta pseudogene 

Rat SOD-2 gene for
manganese- 
containing superoxide 
dismutase 



ACCESSION 



0.063 



Z23107 



699 | M20670 



M.musculus mRNA 
for 5HTx serotonin 
receptor 



0.063 



0.063 



Plasmodium vivax 
circumsporozoite 
protein gene, 3' end. 



0.063 



701 | U95094



702 | U95098



H.sapiens CpG DNA
clone 76g11, reverse
read cpg76g11.rt1a.
Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds
Xenopus laevis
mitotic 

phosphoprotein 44 
mRNA, partial cds 



703 | U95094



0.062 



Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds 



0.063 



0.063 



0.062 



0.063 



DESCRIPTION 



1730141 



1085930 



RETARDATION SYNDROME 
RELATED PROTEIN 2 
>gi|2135129|pir||S60173 fragile 
X mental retardation syndrome 
related protein - human
>gi|1098637 (U31501) fragile X
mental retardation syndrome 
related protein [Homo sapiens] 



P VALUE 



1.8 



hypothetical protein 4 
adenovirus 1 



fowl 



3882143 



1708162 



4033395 



(AB018254) KIAA0711 protein
[Homo sapiens]



HUNTINGTIN
(HUNTINGTON'S DISEASE 
PROTEIN HOMOLOG) (HD 
PROTEIN) 



DNA GYRASE SUBUNIT B
subunit [Myxococcus xanthus]



135091



2981200 



3877951 



RETINOIC ACID RECEPTOR

RXR-BETA sapiens]
>gi|3172498 (AF065396)
retinoic X receptor B
dJ1033B10.11 (Retinoid X
receptor beta (RXRB)) [Homo 
sapiens] >gi|4249766 
(AF120161) retinoic X receptor 
beta 



(AF048732) cyclin T2b [Homo
sapiens]



(ZS1555) predicted using 
Genefinder 



1.3 



0.60 



0.45 



0.35 



339301S 



(AL031174) hypothetical
protein 



0.16 



0.090 



6e-07 



2e-10 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



704 | D90872



705 | M25528



706 | U45256



707 | U95102



708 | AF044317



P VALUE 



E.coli genomic DNA
Kohara clone
#419(34.7-55.1 min.)

M.crystallinum
ferredoxin-NADP+
reductase (fnrA)
mRNA, complete cds.

Strongyloides ratti
microsatellite B DNA 



Xenopus laevis
mitotic 

phosphoprotein 90 
mRNA, complete cds 



Homo sapiens 
TEL/AML1 fusion 
gene, partial sequence
Caenorhabditis
elegans cosmid
T06ES, complete
sequence
[Caenorhabditis
709 | Z73975 | elegans]



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



0.063 



0.062 



0.062 



0.062 



0.062 




Human mRNA for 
heparan sulfate 
proteoglycan



0.062 



Bovine retinal mRNA 
for transducin beta- 
subunit 



D.melanogaster Jun 
and 14-3-3 zeta gene 
Bombus terrestris
mitochondrial 
cytochrome oxidase I, 
partial cds. 



0.062 



0.062 



0.062 



0.062 



DESCRIPTION 



2498198 



<NONE> 



P VALUE



CYTOCHROME B561 
(CYTOCHROME B-561) 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



31081S7 



1076741 



477578 



3879551 



1684959 



<NONE> 



(AC004663) Notch 3 [Homo 
sapiens] 



chitinase (EC 3.2.1.14) 
precursor - rice
>gi|S07955 (X87109) chitinase
[Oryza sativa]



3e-19 



<NONE> I 



<NONE> 



<NONE- 



2.9 



sialidase - Actinomyces viscosus
>gi|141852



(Z70756) similar to collagen 



(U20600) NADH 
dehydrogenase subunit [Vanda 
lamellata] 



0.59 



0.087 



0.073 



0.039 








Nearest Neighbor (BlastN vs. Genbank)

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE


714 


U58994


Human ladinin (LAD)
gene, complete cds


0.062


2811078


AMINOPEPTIDASE B
(ARGINYL
AMINOPEPTIDASE)
(ARGININE
AMINOPEPTIDASE)
(CYTOSOL
AMINOPEPTIDASE IV) (AP-
B) >gi|2039143 (U61696)
aminopeptidase B [Rattus
norvegicus]




715 


AB014553


Homo sapiens mRNA
for KIAA0653
protein, partial cds


0.062


1326350


(U58748) similar to potential
transmembrane domains in S.
cerevisiae nulcear division
RFT1 protein (SP:P38206)


9e-06


716 


L16898 


Mus musculus 
collagen alpha 1 type
XVIII mRNA, 5' end.


0.062 


1723657 


HYPOTHETICAL 38.5 KD

PROTEIN IN ERV1-GLS2 
INTERGENIC REGION 
>gi|2 132587iDirllS64^?' - > 
probable membrane protein 
YGR031w - yeast
(Saccharomyces cerevisiae) 
>gi|1323010|gnl|PID|e243277 

[Saccharomyces cerevisiae] 


Se-10 J 


717


X99343


M.tuberculosis
guaA/B & choD
genes


0.062


3873807


(Z49907) B0491.1
[Caenorhabditis elegans]


1e-14
2e-19




718


AF010193


Homo sapiens MAD-
related gene SMAD7
(SMAD7) mRNA,
complete cds


0.061 


<NONE> 




<NONE> f 




719


L10182


Myrmeleon sp. 18S
ribosomal RNA.


0.061 


<NONE> 


<NONE> 

<NONE> 


<NONE> J 




720


Y14723


Choanomphalus
incertus
mitochondrial
cytochrome c oxidase
subunit I gene, partial


0.061 


<NONE> 


<NONE> 


:NONE> 1 


721


L27840


Bovine respiratory
syncytial virus
nucleoprotein mRNA,
complete cds.


0.061


542955


nucleoporin p62 - human


8 6 1 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION 



722 | U95094



723 | U95098



724 | U26463 



Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds
Sporidiobolus 
salmonicolor 
NADPH-dependent 
aldehyde reductase 
gene, complete cds



0.061 



494454 



0.061 



0.061 



3845272 



1710288 



725 | AF035443



Xenopus laevis wee1
homolog mRNA,
complete cds 



726 



Z48584 



Caenorhabditis 
elegans cosmid
ZK1321, complete 
sequence 
[Caenorhabditis 
elegans] 



0.061 



3979720 



0.061 



3183491 



DESCRIPTION 



ius scrora 

, 'Si|4y44DD|pdb|lPUS|B Sus 

scrofa Sus scrofa 

>gi| 142121 0|pdb| 1PCP| Porcine 

Spasmolytic Protein (Psp) (Nmr, 

19 Structures) Spasmolytic 

Polypeptide 

>gi|I 63306 l|pdb|2PSP|B Chain 
B, Porcine Pancreatic 
Spasmolytic Polypeptide 



(AE001417) hypothetical 
protein [Plasmodium 
falciparum] 



P VALUE! 



(U79302) unknown [Homo 
sapiens] 

' 1 J ' lllllW 



EMBL:D33048 comes from this 
gene; cDNA EST 
EMBL:D35780 comes from this 
gene; cDNA EST yk442c6.3 
comes from this gene; cDNA 
EST yk442c6.5 comes from this 
gene; cDNA EST yk398f6.3 
comes from this gene; cDNA
E...

>gi|3979816|gnl|PID|e1358315
EST EMBL:D35780 comes
from this gene; cDNA EST
yk442c6.3 comes from this 
gene; cDNA EST yk442c6.5 
comes from this gene; cDNA 
EST yk398f6.3 comes from this 
gene; cDNA E... 
HYPOTHETICAL S3.SKD 



PROTEIN C27F2.7 IN 
CHROMOSOME III
>gi|1065510 (U40419) C27F2.7
gene product [Caenorhabditis

elegans]



2.9 



1.3 



0.44 



2e-04 



3e-11






SEQ

ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank) 



727 | X61489



728 AF02540S 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Zea mays pep gene 
for (C3 type) 
phosphoenolpyruvate 
carboxylase 



729 | AB012106



730 | AF027174



731 | Y08682



732 | U95094



733 | AF064030



Drosophila 
melanogaster 
Windbeutel (wind) 
gene, complete cds 



Brassica rapa mRNA 
for SRK45, complete 
cds 

Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath
B) mRNA, complete 
cds 



ACCESSION 



0.061 



0.061 



0.060 



H.sapiens mRNA for 
carnitine 

palmitoyltransferase I 
type I 



Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds 



0.060 



0.060 



0.060 



Helianthus tuberosus 
lectin 2 mRNA, 
complete cds 



734 | AF100694



735 | U95102



737 | U69663



Mus musculus 
Pontin52 mRNA, 
complete cds 



0.060 



0.060 



Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA, complete cds 



Sambucus nigra lectin 
precursor mRNA,

complete cds 0.060 

Human nuclear pore 
complex-associated
protein TPR | 0.060



0.060 



_ DESCRIPTION 
HYPOTHETICAL 32.0 KD 



2496887 



3702295 



<NONE> 



<NONE> 



3319446 



1041119 



632209 



3098348 



125978 



2055394 



4127854 



PROTEIN C09F5.2 IN 
CHROMOSOME III 
>gi|732538 (U22832) C09F5.2 
gene product [Caenorhabditis 
elegans]



(AC005783) R33083.1 [Homo 
sapiens] 



<NONE> 



<NONE> 



(AF077541) contains similarity
to class-I aminoacyl-tRNA 
synthetases [Caenorhabditis 
elegans] 



P VALUE



1e-15



2e-60 



(D78016) TRAE [Enterococcus 
faecalis]



<NONE> I 



8.1 



regulatory protein Rex - primate 
T-lymphotropic virus PTLV-L 
(fragment) 



(AF037401) neuropeptide 
Y/peptide YY receptor Yc 
[Danio rerio]

LAR PROTEIN PRECURSOR
(LEUKOCYTE ANTIGEN
RELATED)
>gi|70146|pir||TDHULK
leukocyte antigen-related 
protein precursor - human 
>gi|34267 sapiens] 



8.1 



3.7 



2.1 



(U87306) transmembrane 
receptor UNC5H2 [Rattus 
norvegicus] 



7JM 



(Y14063) ChT1 thymocyte
antigen [Gallus gallus] 



1.2 



o.: 



9e-04 








Nearest Neighbor (BlastN vs. Genbank)

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE


738 


AB014553


Homo sapiens mRNA
for KIAA0653
protein, partial cds


0.060


1326350


(U58748) similar to potential
transmembrane domains in S.
cerevisiae nulcear division
RFT1 protein (SP:P38206)


1e-09


739 


U95098


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


0.060 


2632098


(Y15513) Prodos protein
[Drosophila melanogaster]


5e-10


740 


Z96260


H.sapiens telomeric 
DNA sequence, clone 
12QTEL 101, read 
12QTELOO10l.seq 


0.059 


I <NONE> 




<NONE> 1 


741 


M93128 


Mouse homeobox 
protein (EVX2) 
mRNA, complete cds. 


0.059 


1 <NONE> 




<NONE> | 


742 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.059 


1652318 


(D90904) lysostaphin 
[Synechocystis sp.]


4.7


743 


AB007920 


Homo sapiens mRNA 
for KIAA0451
protein, complete cds 


0.059 


479491


transcription factor brn-3b -
human


0.71


744 1 


M60445 


Human histidine 
decarboxylase (HDC) 
mRNA, complete cds 


0.058 


<NONE> 


<NONE> 


<NONE> 1 


745 


U01836


Ustilago maydis
exodeoxyribonuclease
(REC1) gene,
complete cds.


0.058 


1171908


OLIGOPEPTIDE
TRANSPORT SYSTEM
PERMEASE PROTEIN OPPC
>gi|1075086|pir||D64184
oligopeptide transport system
permease protein (oppC)
homolog - Haemophilus
influenzae (strain Rd KW20)
permease protein (oppC)
[Haemophilus influenzae Rd]


1.5


746


AF090115


Lycopersicon
esculentum cytosolic
class II small heat
shock protein HCT2
(HSP17.4) mRNA,
complete cds


0.058


3193265


(AF069131) chitinase [Bacillus
subtilis]


0.002 


747


AB012105


Brassica rapa mRNA
for SLG45, complete
cds


0.057


433385


(U03978) dynein heavy chain
isotype 7A [Tripneustes gratilla]


3.4






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION | DESCRIPTION



Arabidopsis thaliana



mRNA for
neoxanthin cleavage

748 | AJ005813 | enzyme



P VALUE | ACCESSION 



749 | Y16828



Lagopus lagopus
genomic
microsatellite
sequence, LLST4



Sambucus nigra
ribosome inactivating
protein precursor

750 | AF012899 | mRNA, complete cds

Sambucus nigra
hevein-like protein

751 | AF074385 | mRNA, complete cds



752 



753 



Sambucus nigra lectin 
precursor mRNA,
U76523 complete cds 

Human retrovirus- like 
M92Q69 sequence-isoleucine c 



0.056 



0.056 



0.055 



0.055 



754 



G1L=ankyrin-like
repeat [orf virus OV
NZ2, Genomic, 1608
S78516 | nt]



Chicken myosin
alkali light chain
mRNA, complete cds
755 | M15646 | clone pF1.



756 | AF027174 



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath
B) mRNA, complete
cds 



0.035 



0.034 



0.03 



0.027 



0.025 



<NONE> 



3328678 



<NONE> 



137339 



DESCRIPTION 



P VALUE



<NQNE> 
<NONE> 



2804465 



<NONE> |<NQ NE>1 

(AE001299) hypothetical 
protein [Chlamydia trachomatis] 1 4.3 



<NONE > 
69 KD PROTEIN 

>gi|94375|pir||S 19150 
hypothetical protein, 69K - 
turnip yellow mosaic virus | 0.69 



<NONE> 
<NONE> 



(AF043700) contains similarity 
to human RNA-binding protein 
FUS/TLS (SW:Q2S009) 
[Caenorhabditis elegans]



<NONE> 
I <NONE>| 



0.15 



3334221 



3877815 



HYDROXYPHENYLPYRUVATE
DIOXYGENASE 4-
hydroxyphenylpyruvate
dioxygenase [Mycosphaerella
graminicola]



(29604S) predicted using 
Genefinder 



6e-17 



5.0 






Nearest Neighbor (BlastN vs. Genbank)



SEQ

ID | ACCESSION | DESCRIPTION



757 | AJ002291



758 | X79104



759 | U95102



Streptococcus
pneumoniae pbp1b
gene, complete



C.botulinum (NCTC
7272 type A) HA-33
and P-21 genes

Xenopus laevis
mitotic

phosphoprotein 90
mRNA, complete cds



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



0.025 



0.024 



0.024 



similar to ribose-
phosphate pyrophosphokinase;
cDNA EST EMBL:D73173
comes from this gene; cDNA
EST EMBL:D70909 comes
from this gene; cDNA EST
EMBL:D73449 comes from this
gene; cDNA EST

EMBL:D76167 comes from this
3880487 | gene

(AE000970) tungsten
formylmethanofuran
dehydrogenase, subunit B (fwdB-
2648615 | 2) [Archaeoglobus fulgidus]

(DS3785) expressed
ubiquitously; product similar to
D.melanogaster mam protein.
1663698 | [Homo sapiens]



760 



U36197 



761 



L38865 



762 | AF035948 



Chlamydomonas 
reinhardtii cobalamin 
independent 
methionine synthase 
mRNA, complete cds



Macaca mulatta 
(clone MMVA63) T- 
cell receptor alpha 
(TCRA) mRNA,
partial cds.



Mus musculus insulin 
receptor substrate-3 



763 



X98890 



S. tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPT1



0.024 



585723 



0.023 



0.023 



P VALUE



0.023 



110072 



1.7 



6.1 



4.7 



PEROXISOME 
PROLIFERATOR 
ACTIVATED RECEPTOR 
GAMMA (PPAR-GAMMA) 
>gi|2S3SlS|p>r||C42214 
peroxisome proliferator- 
activated receptor gamma chain 
African clawed frog >gi|214668
(MS4163) peroxisome
proliferator activated receptor
gamma [Xenopus laevis]



0.42 



<NONE> I <NONE> 

SPLICEOSOME 
ASSOCIATED PROTEIN 49 
spliceosome-associated protein 
2500587 | SAP-49 - human >gi|556217



proline-rich protein MP4 
mouse >gi|53182



<NONE> 



0.40 



0.18 






Nearest Neighbor (BlastN vs. Genbank)

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE


764 


X91212


L.esculentum mRNA
for HP-ZIP protein 


0.022 


1 <NONE> 


<fNL;NE> 


<NONE> 


765 


1 AC0O4498 


Homo sapiens 
chromosome 5, PI 
clone 1209C1 (LBN1 
HI 04), complete 
sequence [Homo 
sapiens] 


0.022 


1 <NONE> 


<NONE> 




766 


U07083 


Humxin nro^mrir* 

■i tuinui i pi UOlullV. uCJU 

phosphatase (ACPP) 
gene, exon 1 


0.022 


1 • <NONE> 




<NONE> J 
<NQNE>| 


767 1 


X98890 


S. tuberosum mRNA 

for moron nir* 

phosphate 
transporter. StPTl 


0.022 


f <NONE> 




<NONE> j 


768 J 


X56488 


L.esculentum I ATS9 
gene 5'flanking 
region, expressed 
during pollen 
maturation 


0.022 


<NONE> 


<NONE> 


<NONE> 


769 1 


M34651 


with upstream and 

downsteam 

sequences. 


0.022 | 


<NONE> 


<NONE> I 


<NONE> J 


770 




X66727 


P taedn cyprir* tr»r 

protochlorophyllide 
reductase 


0.022 1 


3878517 


(Z92S06) K10G4.4 
Caenorhabditis elesans] f 


4.3 j 


771 




i 
I 

U95102 r 


Tiitotic 

^hosphoprotein 90 
tlRNA, complete cds 


0.022 ] 


( 

1 

1854452 s 


D89501) similar to salivary j 
^roline-rich protein P-B [Homo I 
apiens] 1 


4.3 j 


772 




r 
F 

U95098 n 


<enopus laevis 
nitotic 

hosphoprotein 44 
nRNA, partial cds 


0.022 


( 

3021699 s 


AB005298) B AI 2 [Homo 
apiens] ! 


064 1 


773 


Jr 

' f< 
X71932 1 


[.sapiens XB gene 
3r tenascin-X, intron 
4 


0.022 1 


li 
P 

> 

627059 a 


ver stage antigen LSA-1 - j 
Iasmodium falciparum 
gi|9916 (X56203) liver stage 
itigen 1 


0.058 I 


774 1 


C 

X87369 [gt 


.perfringens nanH 
-ne & ORFI.2.3 & 4 


0,022 1 


a 

2062407 [gj 


J7S975) poIy(ADP-ribose) j 
ycohydrolase [Bos taurus] | 


0.056 | 






Nearest Neighbor (BlastN vs. Genbank)
SEQ
ID | ACCESSION | DESCRIPTION



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



775 | Y14971 | Gallus gallus mRNA for K60 protein

776 | AF003133 | Caenorhabditis elegans cosmid T21E3



DESCRIPTION 
U1 SMALL NUCLEAR



777 | AF003133

778 | U57645

Caenorhabditis
elegans cosmid
T21E3

Human helix-loop-helix
proteins Id-1 (ID-1)
and Id-1' (ID-1)
genes, complete
cds



779 



U67570 



Methanococcus
jannaschii section 112
of 150 of the
complete genome
Trypanosoma cruzi



780 | L01584

calcium-binding
protein (CUB2.8)
gene, complete cds.



781 | L04787 | Borrelia hermsii outer membrane lipoprotein



782 | U95094 



Xenopus laevis XL-INCENP
(XL-INCENP) mRNA,
complete cds



0.022 



0.022 



134091 



1709997 



RIBONUCLEOPROTEIN 70
KD (U1 SNRNP 70KD)
>gi|85864|pir||S02016 U1
snRNP 70K protein - African
clawed frog >gi|65179
(X12430) U1 70K [Xenopus
laevis]

DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]



P VALUE 



0.032 



2e-08 



0.022 



0.021



1709997 



<NONE> 



DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe] | 2e-08



<NONE> I <NONE> 



0.021 



<NONE> 



<NONE> 



1 <NONE> 



0.021 



<NONE> 



0.021 



<NONE> 



783 | L36890 



Saccharomyces 

cerevisiae 

mitochondrion 

transfer RNA-Thr1
(tRNA-Thr) gene;
transfer RNA-Val
(tRNA-Val) gene;
oxi2 gene, complete
cds; ORF2 and origin
of replication (ori5).



0.021 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



I <NONE> 



<NONE> 



<NONE> 



0.021 



<NONE> 



<NONE> 



<NONE> 








Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
784 | M76741 | Homo sapiens biliary glycoprotein (BGP) gene, partial cds. | 0.021 | <NONE> | <NONE> | <NONE>
785 | M87504 | Tetrahymena thermophila histone H3 (HHT2) gene, complete cds | 0.021 | <NONE> | <NONE> | <NONE>
786 | U94346 | Human calpain-like protease (htra-3) mRNA, complete cds | 0.021 | <NONE> | <NONE> | <NONE>
787 | L01584 | Trypanosoma cruzi calcium-binding protein (CUB2.8) gene, complete cds. | 0.021 | <NONE> | <NONE> | <NONE>
788 | U36530 | Pongo pygmaeus L r microsatellite, clone #1, from the tandemly repeated genes encoding U2 small nuclear RNA (RNU2 locus) | 0.021 | <NONE> | <NONE> | <NONE>
789 | X03833 | Human gene for interleukin 1 alpha (IL-1 alpha) | 0.021 | 416974 | EARLY TRANSCRIPTION FACTOR 70 KD SUBUNIT | 8.9
790 | U20806 | Dictyostelium discoideum guanine nucleotide-binding protein alpha subunit 5 (G alpha 5) gene, complete cds. | 0.021 | 1401211 | (U58510) RNA helicase homolog [Chlorarachnion CCMP621] | 8.8
791 | Z59258 | H.sapiens CpG DNA, clone 13d2, reverse read cpg13d2.rt1c. | 0.021 | 3121732 | ACONITATE HYDRATASE (CITRATE HYDRO-LYASE) (ACONITASE) >gi|2183256 (AF002133) aconitase [Mycobacterium avium] | 7.0
792 | AF030692 | Plasmodium falciparum strain 7G8 chloroquine resistance candidate protein (cg2) gene, complete cds | 0.021 | 3024190 | NINE PROTEIN >gi|2120251|pir||S66581 hypothetical protein 56 - phage 82 >gi|1051114 (X92588) orf56; related to nin60 (ninE) of bacteriophage lambda | 5.3
793 | U67570 | Methanococcus jannaschii section 112 of 150 of the complete genome | 0.021 | 2341037 | (AC000104) F19P19.17 [Arabidopsis thaliana] | 4.0






Nearest Neighbor (BlastN vs. Genbank)
SEQ
ID | ACCESSION | DESCRIPTION



798 



794 | D86566

796 | U95094

797 | U30938



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



Human DNA for 
NOTCH4, partial cds 
Streptomyces 



coelicolor sigma 
factor (rpoX) gene, 
complete cds. 



Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA. 
complete cds 



0.021 



1708619 



0.021 



Rattus norvegicus 
microtubule- 
associated protein 2 



Chicken mRNA for 
TSC-22 variant, 
complete cds, clone 
SLFEST52 



0.021 



0.021 



0.021 



800 | X71932



Gallus gallus eHAND
mRNA, complete cds | 0.021



801 | AF042333

802

L37380

803 | AF003133



H.sapiens XB gene 
for tenascin-X, intron 
14 



Oryza sativa 24-
methylene lophenol
C24(1)methyltransferase
mRNA, complete
cds



Rat apical endosomal 
glycoprotein mRNA, 
complete cds. 



0.021 



0.021 



79833 



0.021 



Caenorhabditis 
elegans cosmid 
T21E3 



0.021 



NUCLEAR FACTOR NF-
KAPPA-B P100 SUBUNIT
(H2TF1) (ONCOGENE LYT-
10) (LYT10) [CONTAINS:
NUCLEAR FACTOR NF-
KAPPA-B P52 SUBUNIT]



128000 



hypothetical 119.5K protein 
(uvrA region) - Micrococcus 
luteus 

NEUROENDOCRINE
CONVERTASE 1
PRECURSOR (NEC 1) (PC1)
(PROHORMONE
CONVERTASE 1) propeptide
processing protease [Mus
cookii]



468600 



693723 
3449308 



(X74416) beta-3 integrin
[Takifugu rubripes]



627059 



854065 



liver stage antigen LSA-1 - 
Plasmodium falciparum 
>gi|9916 (X56203) liver stage 
antigen 



(X83413) U88 [Human
herpesvirus 6] 



3334377 



1709997 



TRANSMEMBRANE 
PROTEASE, SERINE 2



DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]



P VALUE



1.3 



1.0 



1.0 



27 kda amelogenin 

{alternatively spliced} | 0.61

(AB011541) MEGF8 [Homo
sapiens] | 0.21



0.054 



0.014 



1e-05

3e-08






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
804 | X57689 | Rabbit mRNA for calcium channel BI-2 (lambda CBP109 and CB101) | 0.021 | 2959370 | (AL022117) hypothetical protein | 1e-10
805 | U95102 | Xenopus laevis mitotic phosphoprotein 90 mRNA, complete cds | 0.021 | 1109830 | (U41534) coded for by C. elegans cDNA CEESI42F; Similar to helicases of SNF2/RAD54 family. [Caenorhabditis elegans] | 5e-11
806 | X77753 | H.sapiens TROP-2 gene | 0.021 | 1723657 | PROTEIN IN ERV1-GLS2 INTERGENIC REGION >gi|2132587|pir||S64322 probable membrane protein YGR031w - yeast (Saccharomyces cerevisiae) >gi|1323010|gnl|PID|e243277 (Z72816) ORF YGR031w [Saccharomyces cerevisiae] | 5e-11
807 | X98890 | S.tuberosum mRNA for inorganic phosphate | | 2137872 | zinc finger protein PZF - mouse >gi|453376 | 2e-19
808 | AF027173 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-A) mRNA, complete cds | 0.020 | <NONE> | <NONE> | <NONE>
809 | AJ224935 | Homo sapiens Promotor Region and PCK2 gene | 0.020 | <NONE> | <NONE> | <NONE>
810 | U76524 | Sambucus nigra ribosome inactivating protein precursor mRNA, complete cds | 0.020 | <NONE> | <NONE> | <NONE>
811 | X99941 | A.thaliana GBF1 gene | 0.020 | <NONE> | <NONE> | <NONE>
812 | X65138 | M.musculus mRNA for tyrosine kinase >gb|S57168|S57168 sek=Eph-related receptor protein tyrosine kinase [mice, mRNA, 4242 nt] | 0.020 | <NONE> | <NONE> | <NONE>






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
813 | L04787 | Borrelia hermsii outer membrane lipoprotein | 0.020 | <NONE> | <NONE> | <NONE>
814 | AJ223633 | Enterococcus faecium genes encoding enterocin L50A and enterocin L50B plus 5' and 3' flanking regions | 0.020 | <NONE> | <NONE> | <NONE>
815 | AB012106 | Brassica rapa mRNA for SRK45, complete cds | 0.020 | <NONE> | <NONE> | <NONE>
816 | AE001539 | Helicobacter pylori, strain J99 section 100 of 132 of the complete genome | 0.020 | 172292 | (L11895) transmembrane protein [Saccharomyces cerevisiae] | 8.4
817 | AF074386 | Sambucus nigra hevein-like protein mRNA, complete cds | 0.020 | 94173 | pol polyprotein - Chinese hamster intracisternal A-particle CHIAP34 | 8.0
818 | M55264 | Herpesvirus saimiri dihydrofolate reductase (DHFR) and snRNA (HSUR) genes, complete cds. | 0.020 | 2924250 | (Z98745) dJ29K1.2 [Homo sapiens] | 6.5
819 | AF052163 | Homo sapiens clone 24456 mRNA sequence | 0.020 | 1706288 | D(4) DOPAMINE RECEPTOR (D(2C) DOPAMINE RECEPTOR) >gi|2119482|pir||I49246 D4 dopamine receptor - mouse >gi|753427 (U19880) D4 dopamine receptor [Mus musculus] > a ill095539lDrf1l^ 10Q^5QA dopamine D4 receptor [Mus musculus] | 4.9
820 | AF074387 | Sambucus nigra hevein-like protein mRNA, complete cds | 0.020 | 2113798 | (Z83259) AmphiBrf38 [Branchiostoma floridae] | 4.7
821 | AF052163 | Homo sapiens clone 24456 mRNA sequence | 0.020 | 3874733 | Lbih4) cDNA EST EMBL:T02354 comes from this gene; cDNA EST EMBL:D32698 comes from this gene; cDNA EST EMBL:D35411 comes from this gene | 4.7






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
822 | U1002 | Rat ankyrin binding glycoprotein-1 related mRNA sequence. | 0.020 | 552132 | (K01664) Bkm-like protein [Drosophila melanogaster] | 3.8
823 | AE001539 | Helicobacter pylori, strain J99 section 100 of 132 of the complete genome | 0.020 | 172292 | (L11895) transmembrane protein [Saccharomyces cerevisiae] | 3.8
824 | X98890 | S.tuberosum mRNA for inorganic phosphate transporter, StPT1 | 0.020 | 3879798 | Domain (2 domains); cDNA EST yk390b10.3 comes from this gene; cDNA EST EMBL:D71652 comes from this gene; cDNA EST yk275f8.3 comes from this gene; cDNA EST yk393b9.3 comes from this gene; cDNA EST yk37... >gi|3880220|gnl|PID|e1349842 yk390b10.3 comes from this gene; cDNA EST EMBL:D71652 comes from this gene; cDNA EST yk275f8.3 comes from this gene; cDNA EST yk393b9.3 comes from this gene; cDNA EST yk37... | 1.3
825 | U97519 | Homo sapiens podocalyxin-like protein mRNA, complete cds | 0.020 | 1345633 | C-1-TETRAHYDROFOLATE SYNTHASE, CYTOPLASMIC (C1-THF SYNTHASE) (METHYLENETETRAHYDROFOLATE DEHYDROGENASE / METHENYLTETRAHYDROFOLATE CYCLOHYDROLASE C1-tetrahydrofolate synthase [Rattus norvegicus] | 0.066
826 | AF003133 | Caenorhabditis elegans cosmid T21E3 | 0.020 | 1709997 | DNA REPAIR PROTEIN RAD18 >gi|1150622 protein rad18 [Schizosaccharomyces pombe] | 2e-07
827 | U32857 | Saccharomyces cerevisiae VAR1 gene, mitochondrial gene encoding mitochondrial protein, 3' processing site, partial sequence | 0.019 | <NONE> | <NONE> | <NONE>






Nearest Neighbor (BlastN vs. Genbank) 



SEQ
ID | ACCESSION



DESCRIPTION 



828 | AF027174

829 | AF034099

830 | AF100694



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



Laccaria bicolor 
glyoxal malate 
synthase protein 
mRNA, complete cds



0.019 



0.019 



Mus musculus 
Pontin52 mRNA, 
complete cds 



831 



U24578 



832 | U76523



833 | U57649 



Human RP1 and
complement C4B 
precursor (C4B) 
genes, partial cds. 



Sambucus nigra lectin 
precursor mRNA, 
complete cds 



0.018 



DESCRIPTION 



2506381 



NEUROGENIC LOCUS
NOTCH HOMOLOG



P VALUE



3880930 



<NONE> 



0.013 



Dibenzofuran-
degrading bacterium
DPO360 2,3-
dihydroxybiphenyl
1,2-dioxygenase
(bphC) gene,
complete cds and 2-
hydroxy-6-oxo-6-
phenylhexa-2,4-
dienoic acid
hydrolase



834 | X15642



835 



X51623 



Z.mays gene for
phosphoenolpyruvate
carboxylase



0.011 



0.011 



C.elegans collagen
gene col-13



0.011 



0.010 



478673 



PROTEIN 4 PRECURSOR 
(TRANSFORMING PROTEIN 
INT-3) mammary gene mRNA, 
complete cds.], gene product 
[Mus musculus] 

similar 

Phosphoglucomutase and
phosphomannomutase
phosphoserine; cDNA EST 
EMBL:D36168 comes from this 
gene; cDNA EST 
EMBL:D70697 comes from this 
gene; cDNA EST yk373h9.5 
comes from this gene; cDNA 
EST EMBL:TQ08... 



3.3 



<NONE> 



<NONE> 



proline-rich protein precursor - 
kidney bean vulgaris] 



<NONE> 



<NONE> 



<NONE> 



1695686 



<NONE> 



<NONE> 



(D83706) pyruvate carboxylase 
[Bacillus stearothermophilus]



6e-15 



<NONE> 



3.1 



<NONE> 



<NONE> 



<NONE> 



836 



U83656 



Rattus norvegicus NF 
KB gene, promoter 
region 



0.008 



4240195 



(AB020660) KIAA0853 protein 
[Homo sapiens] 



13 






SEQ
ID | ACCESSION

Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



837 | AJ222657

838 | AB012106



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



Homo sapiens gene 
encoding retina- 
specific guanylyl 
cyclase 



Brassica rapa mRNA 
for SRK45, complete 
cds 



0.008 



417704 



839 



U76524 



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds



0.008 



544024 



840 | AF012899



841 | AF074385 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



0.008 



532468 



POL POLYPROTEIN



(ORFTaTTI) [CONTAINS: 

RNA-DIRECTED RNA 
POLYMERASE ; HELICASE; 
PROTEASE 



1 ] 

CHLORIDE CHANNEL
PROTEIN, SKELETAL
MUSCLE (CHLORIDE
CHANNEL PROTEIN 1) (CLC-
1) human >gi|397143 (Z25587)
human ClC-1 muscle chloride
channel [Homo sapiens]
>gi|398161 (Z25884) human
ClC-1 muscle chloride channel
[Homo sapiens]



P VALUE



7.4 



842 



U48734 



843 | U66669 



Sambucus nigra
hevein-like protein
mRNA, complete cds



0.008 



Human non-muscle 
alpha-actinin mRNA, 
complete cds 



Homo sapiens 3- 
hydroxyisobutyryl- 
coenzyme A 
hydrolase mRNA, 
complete cds 



0.008

0.008



0.007 



844 | D16492

Mouse mRNA for
P100 serine protease
of Ra-reactive factor
(RaRF), complete cds



0.007 



4101160 



(U13643) similar to reverse 
transcriptase; possible 
pseudogene [Caenorhabditis 
elegans] 



(AF002589) cytochrome
oxidase I [Austrofundulus
limnaeus]



1711520 



SRB-3/9 PROTEIN 
>gi| 1334996 



2829922 



<NONE> 



(AC002291) extensin
[Arabidopsis thaliana]



<NONE> 



<NONE> 



4.6 



3.8 



2.7 



1.6 



0.11



<NONE> 



<NONE> 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID | ACCESSION | DESCRIPTION



845 | D90923

846 | AB011087

847 | AE000688

848 | X63723

849 | AF074386



850 



854 



J00097 



851 | D90923



Human
immunodeficiency
virus type 1 proviral
DNA for envelope
glycoprotein, partial
cds, isolate 03S

Homo sapiens mRNA
for KIAA0515
protein, partial cds



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



DESCRIPTION 



0.007 



0.007 



<NONE> 



<NONE> 



Aquifex aeolicus 
section 20 of 109 of 
the complete genome 



B.bovis WC1.1 
mRNA 



Sambucus nigra 
hevein-like protein 
mRNA. complete cds 



Human beta globin
region Alu repetitive
sequence type T.



852 | U95094



853 | X91618 



Human
immunodeficiency
virus type 1 proviral
DNA for envelope
glycoprotein, partial
cds, isolate 03S



0.007 



<NONE> 



P VALUE



0.007 



<NONE> 



0.007 



<NONE> 



0.007 



<NONE> 



Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 



X03838 



T.castaneum
hunchback gene
Rat nontranscribed
spacer (NTS)
downstream of 28S
rRNA gene



0.007 



<NONE> 



0.007 



<NONE> 



855 | M55049



Rattus norvegicus
interleukin-2 receptor
alpha chain (CD25)
mRNA, complete cds.



0.007 



0.007 



<NONE> 



<NONE> 



0.007 



<NONE> 



<NONE> 



<NONE> 



1 <NONE>| 



1 <NONE> 



<NONE> 



I <NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



I <NONE> 



<NONE> 



<NONE>



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
856 | Z64318 | H.sapiens CpG DNA, clone 9e2, reverse read cpg9e2.rt1a | 0.007 | <NONE> | <NONE> | <NONE>
857 | AF027173 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-A) mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>
858 | AF027174 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-B) mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>
859 | | Sambucus nigra ribosome inactivating protein precursor mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>
860 | ^V7Jt / O | P.falciparum complete gene map of plastid-like DNA | 0.007 | <NONE> | <NONE> | <NONE>
861 | | Lycopersicon esculentum class II small heat shock protein Le-HSP17.6 mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>
862 | | Mus musculus Pontin52 mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>
863 | AB000383 | Leucania seperata nuclear polyhedrosis virus DNA for p13, e, envelope protein, complete cds | 0.007 | <NONE> | <NONE> | <NONE>
864 | D86566 | Human DNA for NOTCH4, partial cds | 0.007 | <NONE> | <NONE> | <NONE>
865 | U76524 | Sambucus nigra ribosome inactivating protein precursor mRNA, complete cds | 0.007 | <NONE> | <NONE> | <NONE>









Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
866 | AF027173 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-A) mRNA, complete cds | 0.007 | 3047072 | (AF058825) No definition line found [Arabidopsis thaliana] | 8.9
867 | AF027174 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-B) mRNA, complete cds | 0.007 | 975754 | (U29359) SpaO [Salmonella enterica] | 8.6
868 | U76524 | Sambucus nigra ribosome inactivating protein precursor mRNA, complete cds | 0.007 | 1213557 | (U50199) coded for by C. elegans cDNA yk89e9.5; coded for by C. elegans cDNA cm7g5; coded for by C. elegans cDNA cm14b9; coded for by C. elegans cDNA yk52g5.5; coded for by C. elegans cDNA yk76e5.5; coded for by C. elegans cDNA yk131f11.5; c... | 8.4
869 | AB012106 | Brassica rapa mRNA for SRK45, complete cds | 0.007 | 2499568 | PROTEIN-L-ISOASPARTATE(D-ASPARTATE) O-METHYLTRANSFERASE (PROTEIN-BETA-ASPARTATE METHYLTRANSFERASE) (PIMT) (PROTEIN L-ISOASPARTYL/D-ASPARTYL METHYLTRANSFERASE) methyltransferase [Drosophila melanogaster] >gi|1171337 melanogaster] | 8.3
870 | AF093268 | Rattus norvegicus homer-1c mRNA, complete cds | 0.007 | 4092077 | (AF095353) toll-like receptor 4 mutant [Mus musculus] | 6.2
871 | AF074386 | Sambucus nigra hevein-like protein mRNA, complete cds | 0.007 | 151377 | (M80653) tetraheme [Pseudomonas stutzeri] | 6.2
872 | L42319 | Bos taurus (clone Sil3.8) tristetraprolin | 0.007 | 2507337 | TRANSCRIPTION TERMINATION FACTOR RHO | 5.5



Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
873 | M59815 | Human complement component C4A gene, exons 10 through 41. | 0.007 | 3876769 | U.by<>J/) Similarity to Human Prolyl 4-hydroxylase alpha subunit (SW:P4HA_HUMAN); cDNA EST yk219g12.5 comes from this gene; cDNA EST yk319d8.5 comes from this gene; cDNA EST yk339d11.5 comes from this gene; cDNA EST yk371c9.3... | 5.3
874 | X63723 | B.bovis WC1.1 mRNA | 0.007 | 2969893 | (AJ001858) human SEVI2 [Homo sapiens] | 5.3
875 | AB009864 | Expression vector pME18S-FL3, complete sequence | 0.007 | 2137618 | p45 NF-E2 related factor 2 - mouse musculus] | 5.1
876 | U76524 | Sambucus nigra ribosome inactivating protein precursor mRNA, complete cds | 0.007 | 2804497 | (AF043705) contains similarity to C2H2-type zinc fingers | 5.0
877 | U95102 | Xenopus laevis mitotic phosphoprotein 90 mRNA, complete cds | 0.007 | 440298 | (L27469) product of alternative splicing [Drosophila melanogaster] | 4.7
878 | X58869 | Chicken mRNA for aldehyde dehydrogenase | 0.007 | 1185062 | (L75945) flagellar export protein [Borrelia burgdorferi] | 4.1
879 | AF027735 | Nephila clavipes minor ampullate silk protein MiSp1 mRNA, partial cds | 0.007 | 2394390 | (AF017434) pmi-like gene product [Methylobacterium extorquens] | 4.0
880 | AF105228 | Bos taurus tuftelin mRNA, complete cds | 0.007 | 3036802 | (AL022373) putative protein | 3.9
881 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 0.007 | 2500814 | HYPOTHETICAL 60.2 KD PROTEIN T27F2.1 IN CHROMOSOME V >gi|3880311|gnl|PID|e1349855 BX42 (SW:BX42_DROME); cDNA EST EMBL:C07233 comes from this gene; cDNA EST EMBL:C08532 comes from this gene; cDNA EST yk501h10.3 comes from this gene; cDNA EST yk501f1.3... | 3.8






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
882 | X93567 | L.major mRNA for beta-tubulin (1404bp) | 0.007 | 2317862 | (U78289) tylactone synthase modules 4 & 5 [Streptomyces fradiae] | 3.0
883 | AB012106 | Brassica rapa mRNA for SRK45, complete cds | 0.007 | 3881103 | (AL032646) predicted using Genefinder; cDNA EST EMBL:D76407 comes from this gene; cDNA EST EMBL:C08999 comes from this gene; cDNA EST yk199b12.5 comes from this gene; cDNA EST yk282a4.5 comes from this gene; cDNA EST EMBL:C0... | 2.7
884 | AF041056 | Homo sapiens WSCR4 gene, exons 3 and 4 | 0.007 | 135817 | THROMBIN RECEPTOR PRECURSOR human >gi|339677 (M62424) thrombin receptor [Homo sapiens] | 2.2
885 | AF093268 | Rattus norvegicus homer-1c mRNA, complete cds | 0.007 | 1723518 | HYPOTHETICAL 32.2 KD PROTEIN C22E12.04 IN CHROMOSOME I >gi|1220279 (Z70043) unknown | 2.1
886 | M74798 | Hevea brasiliensis 3-hydroxy-3-methylglutaryl-coenzyme A reductase gene, 3' end. | 0.007 | 1001282 | (D64003) polyA polymerase | 1.9
887 | Z62997 | H.sapiens CpG DNA, clone 76g11, reverse read cpg76g11.rt1a. | 0.007 | 1176532 | HYPOTHETICAL 111.5 KD PROTEIN C34E10.8 IN CHROMOSOME III >gi|500731 (U10402) weakly similar to protein C kinase substrate [Caenorhabditis | 1.8
888 | AF074386 | Sambucus nigra hevein-like protein mRNA, complete cds | 0.007 | 2498317 | PRECURSOR npm.n^P polyprotein antigen precursor [Dictyocaulus viviparus] >gi|1585421|prf||2124414A polyprotein antigen/allergen [Dictyocaulus viviparus] | 1.2
889 | L29426 | Synechocystis species (strain PCC 6803) drgA gene, complete cds. | 0.007 | 3882275 | (AB018320) KIAA0777 protein [Homo sapiens] | 1.1






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
890 | D83329 | Mus musculus DNA for prostaglandin D2 synthase, complete cds | 0.007 | 1001741 | (D64004) hypothetical protein | 0.97
891 | AB012106 | Brassica rapa mRNA for SRK45, complete cds | 0.007 | 1723928 | HYPOTHETICAL 11.6 KD PROTEIN IN NUT1-ARO2 INTERGENIC REGION PRECURSOR YGL149w - yeast (Saccharomyces | 0.94
892 | U76524 | Sambucus nigra ribosome inactivating protein precursor mRNA, complete cds | 0.007 | 121452 | GLUTENIN, HIGH MOLECULAR WEIGHT SUBUNIT 12 PRECURSOR >gi|82606|pir||A24266 glutenin high molecular weight chain 12 precursor - wheat >gi|21779 | 0.79
893 | AF027173 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-A) mRNA, complete cds | 0.007 | 927287 | (U30294) ORF2 [Prevotella ruminicola] | 0.35
894 | Y11918 | H.sapiens IMAGE cDNA clone 268S1 | 0.007 | 1055188 | (L40061) contains similarity to transmembrane domains like those found in sugar transporter proteins | 0.26
895 | L36827 | Mus musculus alphaA-crystallin-binding protein I | 0.007 | 4063019 | (AF083061) ABC transporter TliF [Pseudomonas fluorescens] | 0.21
896 | L36827 | Mus musculus alphaA-crystallin-binding protein I | 0.007 | | ABC transporter TliF [Pseudomonas fluorescens] | 0.20
897 | Z65719 | H.sapiens CpG DNA, clone 54c10, reverse read cpg54c10.rt1a. | 0.007 | 1097307 | HIC-1 gene [Homo sapiens] | 0.20
898 | AF064029 | Helianthus tuberosus lectin 1 mRNA, complete cds | 0.007 | 1174915 | UTROPHIN (DYSTROPHIN-RELATED PROTEIN 1) (DRP1) (DRP) >gi|284488|pir||S28381 utrophin protein) [Homo sapiens] | 0.002
899 | AF051730 | Mus musculus cathepsin S (CatS) gene, exon 6 | 0.007 | 1707017 | (U78721) RNA helicase isolog [Arabidopsis thaliana] | 0.001
Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
900 | U62398 | Oryctolagus cuniculus gp42/basigin/OX-47/HT7 mRNA, complete cds. | 0.007 | 2370494 | (Z98944) hypothetical protein | 2e-04
901 | X76341 | M.musculus glutathione reductase mRNA. | 0.007 | 3513303 | (AC005594) R26984_1 [Homo sapiens] | 8e-07
902 | M26215 | Rat (lambda 20B0.5) M-type 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase | 0.007 | 3036809 | (AL022373) putative protein | 6e-15
903 | AB007902 | Homo sapiens KIAA0442 mRNA, partial cds | 0.007 | 2662165 | (AB007902) HH0712 cDNA clone for KIAA0442 has a 574-bp insertion at position 1474 of the sequence of KIAA0442. [Homo sapiens] | 2e-17
904 | U93364 | Lactococcus lactis cremoris plasmid pNZ4000 insertion sequence IS982 putative transposase gene and eps gene cluster (epsRXABCDEFGHIJKL), complete cds | 0.007 | 2731377 | (U28739) similar to alcohol dehydrogenase/ribitol dehydrogenase [Caenorhabditis elegans] | 1e-31
905 | AF093268 | Rattus norvegicus homer-1c mRNA, complete cds | 0.006 | <NONE> | <NONE> | <NONE>
906 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 0.006 | <NONE> | <NONE> | <NONE>
907 | AF074386 | Sambucus nigra hevein-like protein mRNA, complete cds | 0.006 | <NONE> | <NONE> | <NONE>
908 | AF027174 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-B) mRNA, complete cds | 0.006 | <NONE> | <NONE> | <NONE>
909 | AJ005813 | Arabidopsis thaliana mRNA for neoxanthin cleavage enzyme | 0.006 | <NONE> | <NONE> | <NONE>






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



910 | AF027174

911 | AF093268

912 | AF093268

913 | AB012106

914 | AF064029



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds

Rattus norvegicus
homer-1c mRNA,
complete cds



DESCRIPTION 



0.006 



<NONE> 



Rattus norvegicus
homer-1c mRNA,
complete cds

Brassica rapa mRNA
for SRK45, complete
cds



0.006 



<NONE> 



915 | AF100694

916 | AF093268

917 | AF100694

918 | AF012899

919 | X80289



Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 



0.006 



0.006 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Rattus norvegicus
homer-1c mRNA,
complete cds



0.006 



<NONE> 



0.006 



<NONE> 



0.006 



4049856 



Mus musculus 
Pontin52 mRNA, 
complete cds 



<NONE> 



<NONE> 



0.006 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



H.sapiens PTPL1 
mRNA for protein 
tyrosine phosphatase 



3880536 



0.006 



3877761 



(AF063866) ORF MSV064
hypothetical protein
[Melanoplus sanguinipes
entomopoxvirus]

{Zziu/l)) predicted using
Genefinder; similar to Lectin C-
type domain short and long
forms (2 domains); cDNA EST
EMBL:C10633 comes from this
gene; cDNA EST
EMBL:C12424 comes from this
gene; cDNA EST yk191e7.3
comes from this ...



P VALUE



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> I 



(Z81552) F56G4.1
[Caenorhabditis elegans]
>gi|3878615|gnl|PID|e1348240
(Z83118) F56G4.1



0.006 



1168791

CATHEPSIN E PRECURSOR
precursor - rabbit >gi|402729
(L08418) procathepsin E



9.6 



7.9 



7.5 



7.4 
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Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


920 


AF074386


Sambucus nigra
hevein-like protein
mRNA, complete cds


0.006 


1346371 


DIAc VLLrL VCkROE 

klNA^t!, sets — 

DIACYLGLYCEROL
KINASE)

diacylglycerol kinase (EC
2.7.1.107) beta - rat 90kDa-
diacylglycerol kinase [Rattus


5.5


921 


U72396


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 


0.006 


2196567 


(D88588) lipoprotein
[Escherichia coli]




922 


AF074387


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


2113798 


(Z83259) AmphiBrf38
[Branchiostoma floridae]


4.3

4.3 


923


AB012106


Brassica rapa mRNA
for SRK45, complete 
cds 


0.006 


1388166 


(U5S2S2) Bowel [Drosophila 
melanogaster] 


4.3


924 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


2496785 


HYPOTHETICAL 20.1 KD 
PROTEIN Y4YS 


4.2


925


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.006 


416592 


A-AGGLUTININ
ATTACHMENT SUBUNIT
PRECURSOR
>gi|101170|pir||A41258 a-
agglutinin core protein AGA1 -
yeast (Saccharomyces
cerevisiae)


2.7 


926


AF064029


Helianthus tuberosus
lectin 1 mRNA,
complete cds


0.006 


416592


A-AGGLUTININ
ATTACHMENT SUBUNIT
PRECURSOR
>gi|101170|pir||A41258 a-
agglutinin core protein AGA1 -
yeast (Saccharomyces
cerevisiae)


2.5


927


AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


0.006 


3258584


(U41263) The 3' UTR of this
gene overlaps the 3' UTR of
T19D12.6 (confirmed by EST
hits) [Caenorhabditis elegans]


2.0


928


U33949


Human Down
Syndrome region of
chromosome 21
genomic sequence,
clone A12HI-1A6.


0.006 


3850997


(AF067150) beta-hydroxyacyl-
ACP dehydratase precursor


1.9
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Nearest Neighbor (BlastN vs. Genbank)



SEQ



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ID | ACCESSION | DESCRIPTION | P VALUE



1175 | AF027173



1176 | Y09232



1177 | AJ005813



1178 | AF100694



ACCESSION 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 



H.sapiens fertilin
alpha pseudogene 



2e-04 



<NONE> 



1179 | AF072847



1180 | AF012899



Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 

enzyme 

Mus musculus 
Pontin52 mRNA,
complete cds 
Homo sapiens 



2e-04 



<NONE> 



2e-04 



2e-04 



<NONE> 



<NONE> 



DESCRIPTION 



<NONE> 



<NONE> 



putative swelling- 
activated chloride 
channel (CLNS1A) 
gene, intron 6 



2e-04 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



Sambucus nigra
ribosome inactivating 
protein precursor 
mRNA, complete cds 



1181 | U76524



1182 | AF027173



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
A) mRNA, complete 
cds 



2e-04 



<NONE>



2e-04 



<NONE> 



2e-04 



1213557 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



(U50199) coded for by C.
elegans cDNA yk89e9.5; coded
for by C. elegans cDNA cm7g5;
coded for by C. elegans cDNA
cm14b9; coded for by C.

elegans cDNA yk52g5.5; coded
for by C. elegans cDNA
yk76e5.5; coded for by C.
elegans cDNA yk131f11.5; c...



<NONE> 



<NONE> 



<NONE> 



S.4 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



1183 | AF090115



1184 | AF012899



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



DESCRIPTION 



Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 



1185 | AF074386



1186 | AF027173



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds



Sambucus nigra
hevein-like protein
mRNA, complete cds

Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds



1187 | AJ005813



1188 | AF027174



Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds



EPITHELIAL DISCOIDIN
DOMAIN RECEPTOR 1



2e-04 



729008 



2e-04 



2e-04 



2507582 



1085500 



2e-04 



2623967 



2e-04 



2497316 



2e-04 



1001710 



GLYCOSYLATION END

PRODUCT-SPECIFIC

RECEPTOR PRECURSOR

(RECEPTOR FOR

ADVANCED

GLYCOSYLATION END
PRODUCTS) products receptor
precursor - bovine >gi| 16365 i i
(M91212) receptor for advanced
glycosylation end products [Bos
taurus]



8.3 



PRECURSOR (TYROSINE- 
PROTEIN KINASE CAK) 
(CELL ADHESION KINASE) 
(TYROSINE KINASE DDR) 
(DISCOIDIN RECEPTOR 
TYROSINE KINASE) (TRK E)
(PROTEIN-TYROSINE
KINASE RTK 6) sapiens]

k^o i HE rie^ijjj.i kb 

PROTEIN IN MOLR-BGLX 
INTERGENIC REGION 
>gi| 1788436 (AE000300) 
putative regulator [Escherichia 

coli] | 7.8

collagen alpha 1(IX) chain -
mouse musculus]
>gi|744962|prf||2015346A
collagen:SUBUNIT=alpha1:ISO
TYPE=IX [Mus musculus] | 7.8



(Y13942) GTN Reductase 
[Agrobacterium radiobacter] | 7.4



5.2 



(D64004) hypothetical protein



3.5 






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Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Arabidopsis thaliana






(U41263) The 3' UTR of this




1189 


AJ005813 


mRNA for 
neoxanthin cleavage 
enzyme


2e-04 


3258584 


gene overlaps the 3' UTR of 
T19D12.6 (confirmed by EST
hits) [Caenorhabditis elegans]


2.1 


1190 


AF027173 


Arabidopsis thaliana
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


2736338 


(AF038623) contains similarity 
to RNA recognition motifs 


0.89 


1191 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds


2e-04 


2196567 


(D88588) lipoprotein 
[Escherichia coli] 


0.69


1192 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA,
complete cds 


2e-04 


3319874 


(AJ006096) F-spondin 
[Branchiostoma floridae]


5e-04 


1193 


L26049 


Chlamydomonas 
reinhardtii dynein 
heavy chain alpha 
(ODA11) gene, exons 
2-15, and partial cds. 


2e-04 


3876775 


(Z81077) predicted using 
Genefinder; Similarity to Yeast 
protein 8248 (TR:G587531) 


2e-09 


1194 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-04


<NONE> 


<NONE> 


<NONE> 


1195 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


1e-04


<NONE> 


<NONE> 


<NONE> 


1196 


L34219


Homo sapiens
retinaldehyde-binding
protein (CRALBP)
gene, complete cds.


1e-04


<NONE> 


<NONE> 


<NONE> 


1197 


X51890


Rhesus monkey
interleukin-3 gene


1e-04


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Plasmodium 










1198 


AE001421 


falciparum
chromosome 2,
section 58 of 73 of
the complete
sequence


1e-04


<NONE> 


<NONE> 


<NONE> 


1199 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


1e-04


<NONE>


<NONE>


<NONE>


1200 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


1e-04


2576287


(Y15086) HepC protein
[Cylindrotheca fusiformis]


4.7 


1201 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme


1e-04


3395673


(AB016623) RWC-3 [Oryza
sativa] 


0.14 


1202 


AF038035


Homo sapiens
BRCA1-associated
RING domain protein
(BARD1) gene,
exons 2 and 3 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1203 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme


9e-05 


<NONE> 


<NONE> 


<NONE> 


1204 


AB012106 


Brassica rapa mRNA
for SRK45, complete 
cds 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1205 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44
mRNA, partial cds 


9e-05 


<NONE> 


<NONE>


<NONE> 


1206 


AF034099


Laccaria bicolor
glyoxal malate
synthase protein
mRNA, complete cds


9e-05 


1351553


HYPOTHETICAL

LIPOPROTEIN MG348
PRECURSOR
>gi|1361668|pir||E64238
hypothetical protein MG348 -
Mycoplasma genitalium (SGC3)
>gi|3844931


8.8 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 














1207 


D50006 


Human DNA for 
alpha-platelet-derived 
growth factor 
receptor, exon 6-10 


9e-05 


3063639 


(AF056494) NADH 
dehydrogenase subunit 5 
[Panorpa japonica] 


5.1 


1208


U50423 


Human Down 
Syndrome region of 
chromosome 21, 
clone A41B8-IB7. 


9e-05 


124273 


INHIBIN ALPHA CHAIN 
PRECURSOR bovine 
>gi|163195 (M13273) inhibin A 
subunit [Bos taurus] 


3.0 


1209 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


9e-05 


4007782 


(X72850) 2,4-
dihydroxybenzoate 
monooxygenase [Sphingomonas 

sp.] 


2.3 


1210 


AC005276 


Homo sapiens clone 
fragment 

UWGC:gap3 from 
7q31.3, complete 
sequence [Homo 
sapiens] 


9e-05 


1492075 


(U60315) MCI32L [Molluscum 
contagiosum virus subtype I] 


1.0 


1211 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-05 


2887423 


(AB007884) KIAA0424 [Homo 
sapiens] 


2e-10


1212 


X77772 


C.fuscus gamma-M2- 
1 crystallin mRNA. 


9e-05 


2072425 


(U83115) non-lens beta gamma-
crystallin like protein [Homo
sapiens]


7e-25 


1213 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 

cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1214 


L06178


Apis mellifera 
ligustica complete 
mitochondrial 
genome 


8e-05


<NONE> 


<NONE> 


<NONE> 


1215 


AB012106


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE>


<NONE> 


1216 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


8e-05 


<NONE> 


<NONE> 


<NONE> 


1217 


L06178


Apis mellifera
ligustica complete
mitochondrial
genome


8e-05


<NONE> 


<NONE> 


<NONE> 


1218


AB012106


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1219 


AF100694


Pontin52 mRNA,
complete cds


8e-05


<NONE> 


<NONE> 


<NONE> 


1220 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1221 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


1722841 


WNT-11 PROTEIN 
PRECURSOR (XWNT-11)
clawed frog >gi|439108
(L23542) maternal protein 


9.9 


1222 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


1205991 


(U35637) nebulin [Homo 
sapiens]


9.6 


1223 


AF024605 


Homo sapiens serine 
protease-like protease 
Sequence 2 from 
patent US 5736377 


8e-05 


3242783 


(AF055354) respiratory burst 
oxidase protein B 


8.6 


1224 


Y13148 


Rattus norvegicus 
mRNA for PAG608 
gene 


8e-05 


2314243 


(AE000616) alpha-ketoglutarate
permease CketP) 


8.1 


1225 


AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


8e-05 


1170586 


RAS GTPASE-ACTIVATING-
LIKE PROTEIN IQGAP1
(P195) (KIAA0051)
>gi|627594|pir||A54854 Ras
GTPase activating-related
protein - human sapiens]
>gi|536844 (L33075) ras
GTPase-activating-like protein
[Homo sapiens]


7.8 


1226 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


464239 


NADH- UBIQUINONE 
OXIDOREDUCTASE CHAIN 
4 >gi|1085185|pir||S52968
NADH dehydrogenase chain 4 -
honeybee mitochondrion
(SGC4) >gi|552446 (L06178)
NADH dehydrogenase subunit 4
[Apis mellifera ligustica]


3.5 


1227 


AF100694


Mus musculus
Pontin52 mRNA, 
complete cds 


8e-05 


544353 


F-SPONDIN PRECURSOR 


3.5



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1228 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


8e-05 


483243 


apolipoprotein B-100 - chicken
(fragment) 


3.4 


1229 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


8e-05 


91207 


proline-rich protein - mouse 
(fragment) musculus] 


2.2 


1230 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


2499181 


ZONADHESIN PRECURSOR
>gi|1066466


2.2 


1231 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


2499181 


ZONADHESIN PRECURSOR 
>gi|1066466 


1.9 


1232 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


2833647 


(AF027972) flagelliform silk 
protein [Nephila clavipes]


1.6 


1233 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


8e-05 


1163063


(Z49821) MYO2
[Saccharomyces cerevisiae] 


0.90 


1234 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


1653488


(D90914) hypothetical protein 


0.30


1235 


M26510 


Chicken nonmuscle 
myosin heavy chain 
(MHC) gene, 
complete cds. 


8e-05 


112159 


plectin - rat 


0.003 


1236 


U56402 


Human chromatin 
structural protein 
homolog


8e-05 


2088823 


(AF003384) weak similarity to 
the peptidase family A2 


1e-13


1237 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


8e-05 


437181 


(U02289) GTPase-activating
protein [Caenorhabditis elegans]


2e-17 


1238


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-05 


465983 


HYPOTHETICAL 80.8 KD 
PROTEIN ZC21.4 IN 
CHROMOSOME III 


8e-27 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


















AF090115 


Lycopersicon 
esculentum cytosolic
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


7e-05 


<NONE> 


<NONE>


<NONE> 


1240 


U83656 


Rattus norvegicus NF- 
KB gene, promotor 
region


7e-05 


3880858 


(AL031633) predicted using
Genefinder; cDNA EST
yk304f12.5 comes from this
gene [Caenorhabditis elegans] 


9.3 




AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


7e-05 


3080538 


(AL022600) hypothetical 
protein 


9.2 


1242 


X89398


H.sapiens ung gene 
for uracil DNA- 
glycosylase 


7e-05


549700 


kYfOlhLLUCAL 15 J KOJ 
PROTEIN IN MDH1-VMA5 

INTERGENIC REGION

>gi|539182|pir||S37908
hypothetical protein YKL083w -
yeast (Saccharomyces
cerevisiae) >gi|486120
(Z28082) ORF YKL083w


1.8


1243


M33753 


Bovine follicle 
stimulating hormone- 
beta subunit gene, 
complete cds. 


7e-05 


2398621 


(AJ000342) DMBT1 protein,
5.8 kb transcript [Homo sapiens] 


1.8 


1244 


M80829 


Rat troponin T 
cardiac isoform gene, 
complete cds 


5e-05 


854065 


(X83413) U88 [Human
herpesvirus 6] 


2e-08 


1245 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


4e-05 


120240 


FLAGELLIN B2 PRECURSOR 

Methanococcus voltae 

>gi|150063 (M72148) flagellin


5.2 


1246 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


3e-05


<NONE>


<NONE>


<NONE>


1247 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1248


AF074386


Sambucus nigra
hevein-like protein
mRNA, complete cds


3e-05 


<NONE> 


<NONE> 


<NONE> 
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Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE






Rattus norvegicus 










1249 


AF093268 


homer-1c mRNA,
complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1250 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


3e-05 


2773226 


(AF039716) Similar to protein 
kinase [Caenorhabditis elegans] 


6.7 


1251


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05


2072961


(U93568) putative p150 [Homo
sapiens]


5.6 


1252 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6
mRNA, complete cds 


3e-05 


121855 


EXOGLUCANASE II
PRECURSOR cellulose 1,4-beta-
cellobiosidase (EC 3.2.1.91) II
precursor - fungus (Trichoderma
reesei) 1,4-beta-cellobiosidase
(EC 3.2.1.91) II - fungus
cellobiohydrolase II
[Trichoderma reesei]


4.6 


1253 


U76524 


Sambucus nigra 
ribosome inactivating
protein precursor 
mRNA, complete cds 


3e-05 


3880516 


(AL021572) similar to CTP 
SYNTHASE (EC 6.3.4.2) (UTP- 
- AMMONIA LIGASE) (CTP 
SYNTHETASE; 


J.J 


1254 


M88299 


Mouse brain-1 POU-
domain protein,
complete cds.


3e-05


1947048 


(U66102) intimin [Escherichia 
coli]


3.0 


1255 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


3e-05 


3122872 


CELL-CYCLE NUCLEAR
AUTOANTIGEN SG2NA

(S/G2 NUCLEAR ANTIGEN)

>gi|1082650|pir||JC2522 nuclear
autoantigen - human >gi|805095
(U17989) GS2NA


2.8 


1256 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


1352145 


CYTOCHROME C OXIDASE
POLYPEPTIDE I chain I - 

Thermite minrinK *i»oil 1 'S'SOR^ 

(M84341) cytochrome c oxidase 
subunits precursor [Thermus 
thermophilus]


2.6 


1257


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 


3e-05 


2811015


SEGMENTATION POLARITY
PROTEIN ENGRAILED
>gi|2076747 (U42429)
engrailed [Anopheles gambiae]
>gi|2148918 (U42214)
engrailed [Anopheles gambiae] 


2.0 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION


P VALUE 
















1258 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


3e-05 


1657752 


(U62325) FE65-like protein 
[Homo sapiens]


1.7 


1259 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05 


2072961 


(U93568) putative p150 [Homo
sapiens] 


1.5 


1260 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


3e-05 


1352145 


CYTOCHROME C OXIDASE 
POLYPEPTIDE I chain I - 
Thermus aquaticus >gi|l5o08j 
(M84341) cytochrome c oxidase 
subunits precursor [Thermus 
thermophilus] 


1.1 


1261


X91890


H.sapiens regulatory
region of HOXA7
gene 


3e-05 


111013 


Sxr (Bkm-homolog) sex- 
determining region protein - 
mouse 


1.0 


1262 


L36936 


Homo sapiens metase 
gene, partial cds. 


3e-05 


1944352 


(D84239) IgG Fc binding 
protein [Homo sapiens] 


0.99 


1263 


AB012105 


Brassica rapa mRNA
for SLG45, complete 
cds 


3e-05 


417782 


SMP2 PROTEIN 
>gi|320853|pir||S30911 SMP2 
protein - yeast (Saccharomyces 
cerevisiae) gene 
[Saccharomyces cerevisiae]


0.89 


1264 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


1708501 


INTEGRIN ALPHA CHAIN-
LIKE PROTEIN alpha Intlp 
[Candida albicans] 


0.39 


1265 


AF090115 


Lycopersicon 
esculentum cytosolic
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


3e-05 


1587031


cis-Golgi matrix protein GM130
[Rattus norvegicus]


0.20 


1266 


Z81014


Human DNA
sequence from
cosmid U65A4,
between markers
DXS366 and DXS37
on chromosome X


3e-05 


2072964 


(U93569) putative pl50 [Homo 
sapiens] 


0.049 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












glycosylated and myristilated




1267 


Z96668


H.sapiens telomeric
DNA sequence, clone
7PTEL001, read
7PTEL00001.seq


3e-05


542429


smaller surface antigen -
Plasmodium falciparum
>gi|836640 (X76298)
glycosylated and myristilated
smaller surface antigen gallus]
>gi|1092178|prf||2023165B
surface antigen 


0.029 


1268 


AB012105


Brassica rapa mRNA
for SLG45, complete
cds


3e-05


:>5/912I 


(Z70310) predicted using
Genefinder; Similarity to Mouse
ankyrin (PIR Acc. No. S37771);
cDNA EST EMBL:T01923
comes from this gene; cDNA
EST EMBL:D32335 comes
from this gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES... Genefinder;
Similarity to Mouse ankyrin

(PIR Acc. No. S37771); cDNA

EST EMBL:T01923 comes
from this gene; cDNA EST
EMBL:D32335 comes from this
gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES...


2e-13 


1269 


AF074385 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


3e-05


2497677


ZYXIN (ZYXIN 2) sapiens]
>gi|1545954|gnl|PID|e223417
(X95735) zyxin 


2e-23 


1270 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


1e-05


<NONE> 


<NONE> 


<NONE> 


1271


X16318 


Canine mRNA for 
signal recognition 
particle 54k protein 


1e-05


3122612 


PITUITARY HOMEOBOX 3 
(HOMEOBOX PROTEIN 
PITX3) >gi|2645427 
(AF005772) homeobox protein
Pitx3 [Mus musculus]


4.4 


1272 


AB012105


Brassica rapa mRNA
for SLG45, complete
cds


1e-05


1652458 


(D90905) DNA mismatch repair 
protein MutL [Synechocystis 
sp.]


0.62 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1273 


U57843 


Human 

phosphatidylinositol
3-kinase delta 
catalytic subunit 
mRNA, complete cds 


1e-05


475909 


(X67098) ORF1A [Homo 
sapiens] 


0.22 


1274 


Z96569 


H.sapiens telomeric 
DNA sequence, clone 
2QTEL054, read 
2QTEL00054.seq


1e-05


2137043 


unknown protein - rabbit 
(fragment) cuniculus] 


0.005 


1275 


AE000810 


Methanobacterium 
thermoautotrophicum 
from bases 172512 to 
182957 (section 16 of
148) of the complete
genome


1e-05


3877579 


kinensin-like protein KIF4 
(SW:P33174); cDNA EST
EMBL:D27320 comes from this 
gene; cDNA EST 
EMBL:D27322 comes from this 
gene; cDNA EST 
EMBL:D27321 comes from this 
gene; cDNA EST 
EMBL:D35764 comes... Mouse
kinensin-like protein KIF4 
(SW:P33174); cDNA EST 
EMBL:D27320 comes from this 
gene; cDNA EST 
EMBL:D27322 comes from this 
gene; cDNA EST 
EMBL:D27321 comes from this 
gene; cDNA EST 
EMBL:D35764 comes... 


6e-27 


1276 


AB012113 


Homo sapiens gene 
for CC chemokine 
PARC precursor, 
complete cds 


9e-06 


<NONE> 


<NONE>


<NONE> 


1277 


AC005830 


Homo sapiens Xp22-
154-155 BAC GSHB- 
52411 (Genome 
Systems Human BAC 
Library), complete 
sequence [Homo 
sapiens] 


9e-06 


<NONE> 


<NONE> 


<NONE> 


1278 


D86245


Human MHC (HLA) 
DRB intron 1 DNA, 
partial sequence 


9e-06 


1051253 


(U37531) mucin apoprotein 
[Mus musculus] 


1.3 


1279 


D79998 


Human mRNA for
KIAA0176 gene, 
partial cds 


9e-06 


2833253 


HYPOTHETICAL PROTEIN 
KIAA0176 sapiens]


4e-06 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(Z.&yo.oj Similarity to least 




1280 


U10246


Toxoplasma gondii
RH uracil
phosphoribosyl
transferase gene, 
complete cds. 


9e-06 


3876090 


uridine kinase

(SW:URK1_YEAST); cDNA 
EST EMBL:Z14695 comes 
from this gene; cDNA EST 
CEMSE17F comes from this 
gene; cDNA EST 
EMBL:D67355 comes from this 
gene; cDNA EST yk209h1.5
comes from this se... 


7e-33 


1281


U10246


Toxoplasma gondii 
RH uracil 
phosphoribosyl 
transferase gene, 
complete cds. 


9e-06 


3876090 


(/^oyoJ^) Similarity to Yeast 
uridine kinase

(SW:URK1_YEAST); cDNA
EST EMBL:Z14695 comes
from this gene; cDNA EST
CEMSE17F comes from this
gene; cDNA EST
EMBL:D67355 comes from this
gene; cDNA EST yk209h1.5
comes from this se... 


7e-34 


1282 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


8e-06 


<NONE> 


<NONE> 


<NONE> 


1283 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


8e-06 


<NONE> 


<NONE> 


<NONE> 


1284 


U66340 


Human Rh blood 
group C antigen 
(RHCE) gene, exon 
2, partial cds


8e-06 


1707155 


(U80837) F07E5.6 gene product 
[Caenorhabditis elegans] 


9.6 


1285 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


7e-06 


<NONE> 


<NONE> 


<NONE> 


1286 


M29930 


Human insulin 
receptor (allele 2) 
gene, exons 14, 15, 
16 and 17. 


4e-06 


<NONE> 


<NONE> 


<NONE> 


1287 


L42103 


Homo sapiens 
(subclone 5_d3 from 
PI H25) DNA 
sequence. 


3e-06 


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1288


AF012244


cerberus-like (Cer-1) 
gene, complete cds 


3e-06 


<NONE> 


<NONE> 


<NONE> 


1289 


Z69366 


Human DNA 
sequence from 
cosmid L96F8, 
Huntington's Disease 
Region, chromosome 
4p16.3 contains EST.


3e-06 


<NONE> 


<NONE> 


<NONE> 


1290 


Z69366 


Human DNA 
sequence from 
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST.


3e-06


<NONE>


<NONE>


<NONE>


1291 


X85232 


H. sapiens 
chromosome 3 
sequences 


3e-06 


<NONE> 


<NONE> 


<NONE> 


1292 


M32674 


Human platelet 
glycoprotein IIIa,
exons 7, 8 and 9.


3e-06


<NONE>


<NONE>




1293 


D16879


Human HepG2 3' 
region cDNA, clone 
hmd2a01 


3e-06 


998296 


(U33484) ependymin 
[Hemiodus sp.]


5.6 


1294 


U18614 


Lagothrix lagotricha
interphotoreceptor
retinoid-binding 
protein (IRBP) gene, 
intron 1, complete 
sequence 


3e-06 


1613846 


(U71440) polyprotein [Rice 
tungro spherical virus]


5.0 


1295


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2
(HSP17.4) mRNA,
complete cds


3e-06 


1477646 


(U53204) plectin [Homo 

cnnif*nc1 ^>oi 1 1 J.776 S 1 (TTfilfilO') 
buuicribj 1 v / 

plectin [Homo sapiens] 


4.0 


1296 


AF016898


Homo sapiens B-ATF 
gene, complete cds 


3e-06 


1085177 


reverse transcriptase - fruit fly 
reverse transcriptase 
[Drosophila yakuba] 


3.0 


1297 


AB018490


Homo sapiens DNA,
trinucleotide repeats
region


3e-06 


3876572 


(Z81522) predicted using 
Genefinder; similar to RNA 
recognition motif, (aka RRM, 
RBD, or RNP domain) 
[Caenorhabditis elegans]


3.0 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1298 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


3e-06 


4240137 


(AB020631) KIAA0824 protein 
[Homo sapiens] 


2.7 


1299 


M37929 


Homo sapiens 
adenosine 
monophosphate 
deaminase 1 
(AMPD1) gene, 
exons 11-12.


3e-06 


1653775 


(D90916) thiol:disulfide
interchange protein DsbD
[Synechocystis sp.]


1.7 


1300 


M37929 


Homo sapiens 
adenosine 
monophosphate 
deaminase 1 
(AMPD1) gene, 
exons 11-12.


3e-06 


1653775 


(D90916) thiol:disulfide
interchange protein DsbD
[Synechocystis sp.]


1.7 


1301 


U60496 


Glycine max actin 
(Soy86) gene, partial 
cds 


3e-06 


1730738 


ACTIN-LIKE PROTEIN ARP5 
Ynl2430p [Saccharomyces 
cerevisiae]


2e-05 


1302 


X14363 


Yersinia 

pseudotuberculosis 
rplC, rplD, rplW,
rplB and rpsS genes
for ribosomal proteins
L3, L4, L23, L2 and
S19 


3e-06 


585879 


50S RIBOSOMAL PROTEIN 
L2 maritima >gi|437926
(Z21677) ribosomal protein L2


2e-12


1303 


Z34969 


H.sapiens DNA for 

microsatellite

polymorphism 


2e-06 


<NONE> 


<NONE> 


<NONE> 


1304 


X64707 


H.sapiens BBC1 
mRNA 


1e-06


<NONE> 


<NONE> 


<NONE> 


1305 


AC005830 


Homo sapiens Xp22- 
154-155 BAC GSHB- 
52411 (Genome 
Systems Human BAC 
Library), complete 
sequence [Homo 
sapiens] 


1e-06


<NONE> 


<NONE> 


<NONE> 


1306 


J04058 


Human electron 
transfer flavoprotein
alpha-subunit mRNA,
complete cds.


1e-06


<NONE> 


<NONE> 


<NONE> 
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Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1307 


L25647 


Homo sapiens 
fibroblast growth 
factor receptor gene 
(located in the central 
MHC) signal peptide 
and consecutive exon 


1e-06


1586734 


mxcQ gene [Methylobacterium 
organophilum] 


5.4 


1308 


L26261 


Human MHC class III 
HLA-RP1 gene. 


1e-06


1684985 


(U20633) NADH 
dehydrogenase subunit 
[Neuwiedia veratrifolia] 


1.3 


1309 


AF002283 


Mus musculus alpha- 
actinin-2 associated 
LIM protein mRNA, 
alternatively spliced 
product, complete cds 


1e-06


2996196 


(AF053367) carboxyl terminal 
LIM domain protein [Mus 
musculus] 


4e-17


1310 


M10935


Human haptoglobin 
gene (alpha-2 allele), 
complete cds and 
haptoglobin-related 
gene, exon 1 and 
three Alu repeats. 


6e-07 


<NONE> 


<NONE> 


<NONE> 


1311 


AC002251


Homo sapiens 
(subclone l_g6 from 
BAC H76) DNA 
sequence 


4e-07 


2144491 


coagulation factor Xa (EC 
3.4.21.6) precursor norvegicus] 


4.2 


1312 


AF047717 


Streptomyces 
chrysomallus 
actinomycin 
synthetase II (acmB) 
gene, complete cds 


4e-07 


699196 


(U15181) 4-coumarate-CoA
ligase [Mycobacterium leprae]


1e-06


1313 


U14417


Human Ral guanine 
nucleotide 
dissociation 
stimulator mRNA, 
partial cds. 


4e-07 


544402 


GUANINE NUCLEOTIDE 
DISSOCIATION 
STIMULATOR RALGDS 
FORM A (RALGEF) 
>gi|321257|pir||S28415 guanine 
nucleotide dissociation 
stimulator ralGDS - mouse 
>gi| 193573 (L07924) guanine 
nucleotide dissociation 
stimulator [Mus musculus] 


8e-08 


1314 


Z79027 


H. sapiens flow-sorted 
chromosome 6 
HindIII fragment, 
SC6pA20G8 


3e-07 


<NONE> 


<NONE> 


<NONE> 



WO 01/02568 



PCT/US00/18374 











1315 


U67167 


Homo sapiens 
intestinal mucin 
(MUC2) gene, 
promoter region and 
partial cds 


3e-07 


<NONE> 


<NONE> 


<NONE> 


1316 


AF086256 


Homo sapiens full 
length insert cDNA 
clone ZD41C11 


3e-07 


<NONE> 


<NONE> 


<NONE> 


1317 


U67228 


Human clone HS4.61 
Alu-Ya5 sequence 


3e-07 


1938437 


(U97003) contains similarity to 
C4-type zinc fingers and a 
ligand-binding domain of 
nuclear hormone receptors 


2.3 


1318 


U94346 


Human calpain-like 
protease (htra-3) 
mRNA, complete cds 


3e-07 


2911858 


(AF047659) No definition line 
found [Caenorhabditis elegans] 


0.39 


1319 


Y15724 


Homo sapiens 
SERCA3 gene, exons 
1-7 (and joined CDS) 


1e-07 


<NONE> 


<NONE> 


<NONE> 


1320 


X13596 


Bean DNA for 
glycine-rich cell wall 
protein GRP 1-8 


1e-07 


<NONE> 


<NONE> 


<NONE> 


1321 


M83094 


Homo sapiens 
cytosolic selenium- 
dependent glutathione 
peroxidase gene, 
complete cds, and 
rhoh12 gene, 3' end. 


1e-07 


1326385 


(U58751) C07G1.7 gene 
product [Caenorhabditis 
elegans] 


8.0 


1322 


Z55905 


H.sapiens CpG DNA, 
clone 71f4, forward 
read cpg71f4.ft1a. 


1e-07 


1076802 


extensin-like protein - maize 
>gi|600H8 mays] 


0.61 


1323 


X03541 


Human mRNA of trk 
oncogene > :: 
gb|I96186|I96186 
Sequence 23 from 
patent US 5734039 


1e-07 


325465 


(M74509) [Human endogenous 
retrovirus type C oncovirus 
sequence.], gene product [Homo 
sapiens] 


3e-04 


1324 


AF027766 


Canis familiaris Y- 
linked zinc finger 
protein 


1e-07 


220643 


(D10628) zinc finger protein 
[Mus musculus] 


7e-08 


1325 


D13613 


Bovine mRNA for 
rabphilin-3A, 
complete cds > :: 
dbj|E07809|E07809 
cDNA encoding 
rabphilin-3A 


1e-07 


2822161 


(AC004082) rab3 effector-like; 
35% Similarity to AF007836 
(PID:g2317778) [Homo 
sapiens] 


6e-11 






1326 


X57110 


Human mRNA for c- 
cbl proto-oncogene 


1e-07 


323270 


(J04169) gag-onc fusion protein 
[Cas NS-1 retrovirus] 


3e-14 


1327 


X57110 


Human mRNA for c- 
cbl proto-oncogene 


1e-07 


115855 


PROTO-ONCOGENE C-CBL 
human >gi|2973 i (X57110) c- 
cbl protein [Homo sapiens] 


4e-19 


1328 


AC001178 


Homo sapiens 
(subclone 2_g12 from 
BAC H94) DNA 
sequence 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1329 


U11866 


Human interleukin-8 
receptor type B 
(IL8RB) gene, 
promoter and exons 1- 
6 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1330 


AC001225 


Homo sapiens 
(subclone 2_e6 from 
BAC H94) DNA 
sequence 


4e-08 


478184 


histone H1 II-1 (clone L95) - 
midge 


6.5 


1331 


M73837 


Human modulator 
recognition factor 2 
(MRF-2) mRNA, 
complete cds. 


4e-08 


141448 


HYPOTHETICAL 32.6 KD 
PROTEIN IN TRANSPOSON 
TN4556 >gi|80758|pir||JQ0428 
hypothetical 32.6K protein - 
Streptomyces fradiae transposon 
Tn4556 


4.7 


1332 


AC006164 


Homo sapiens clone 
UWGC:y28gap from 
6p21, complete 
sequence [Homo 
sapiens] 


4e-08 


2580578 


(AF000996) ubiquitous TPR 
motif, Y isoform [Homo 
sapiens] 


1.2 


1333 


X01060 


Human mRNA for 
transferrin receptor 


4e-08 


135514 


T-CELL RECEPTOR BETA 
CHAIN PRECURSOR 
precursor (ANA 11) - rabbit 


0.61 


1334 


Y10697 


H.sapiens INE2 
mRNA 


4e-08 


124909 


INSULIN RECEPTOR- 
RELATED PROTEIN 
PRECURSOR (IRR) (IR- 
RELATED RECEPTOR) 
>gi|186555 sapiens] 


0.14 


1335 


U60416 


Rattus norvegicus 
myr 6 myosin heavy 
chain mRNA, 
complete cds 


4e-08 


102189 


myosin I, high molecular weight 
- Acanthamoeba sp 


3e-08 












1336 


U23804 


Drosophila 
melanogaster putative 
GTP-binding 
regulatory protein 
beta chain (GPB) 
mRNA, partial cds. 


4e-08 


2494916 


HYPOTHETICAL 55.2 KD 
TRP-ASP REPEATS 
CONTAINING PROTEIN 
T10F2.4 IN CHROMOSOME 
III protein; similar to G-Beta 
repeat region (Trp-Asp 
domains) of guanine nucleotide 
binding protein 


1e-28 


1337 


AE000213 


Escherichia coli K-12 
MG1655 section 103 
of 400 of the 
complete genome 


4e-08 


3294172 


(AL022325) rF27C3.1.1 
(protein similar to C. elegans 
protein B0035.16) (isoform 1) 
[Homo sapiens] 


2e-67 


1338 


D89821 


Mus musculus mRNA 
for RhoM, complete 
cds 


2e-08 


3024539 


RHO-RELATED GTP- 
BINDING PROTEIN RHOD 
(RHO-RELATED PROTEIN 
HP1) (RHOHP1) sapiens] 


1e-04 


1339 


U74382 


Human telomeric 
repeat DNA-binding 
protein (PIN2) 
mRNA, complete cds 


1e-08 


<NONE> 


<NONE> 


<NONE> 


1340 


L35657 


Homo sapiens 
(subclone H8 5_a10 
from P1 35 H5 C8) 
DNA sequence. 


1e-08 


<NONE> 


<NONE> 


<NONE> 


1341 


L21936 


Human succinate 
dehydrogenase 
flavoprotein subunit 


1e-08 


3201678 


(AF060886) adenine 
phosphoribosyltransferase 
[Leishmania tarentolae] 


4.0 


1342 


AB009777 


Homo sapiens gene 
for osteonidogen, 
promoter region 


1e-08 


479388 


tritin - wheat 

>gi|391929|gnl|PID|d1003454 


2.2 


1343 


M58600 


Human heparin 
cofactor II (HCF2) 
gene, exons 1 through 
5. 


1e-08 


1730173 


GLUCOSE-6-PHOSPHATE 
ISOMERASE, CYTOSOLIC 2 
(GPI) (PHOSPHOGLUCOSE 
ISOMERASE) (PGI) isomerase 
[Clarkia concinna] 


1.9 


1344 


M58600 


Human heparin 
cofactor II (HCF2) 
gene, exons 1 through 
5. 


1e-08 


1730173 


GLUCOSE-6-PHOSPHATE 
ISOMERASE, CYTOSOLIC 2 
(GPI) (PHOSPHOGLUCOSE 
ISOMERASE) (PGI) isomerase 
[Clarkia concinna] 


1.7 


1345 


AC000980 


Homo sapiens 
(subclone 1_g2 from 
P1 H31) DNA 
sequence 


1e-08 


439877 


(L27428) reverse transcriptase 
[Homo sapiens] 


1.1 
















1346 


U48734 


Human non-muscle 
alpha-actinin mRNA, 
complete cds 


1e-08 


168237 


(M76546) hydroxyproline-rich 
protein [Helianthus annuus] 


0.19 


1347 


M76724 


Human leukocyte 
adhesion receptor 
alpha subunit 


1e-08 


1177607 


(X92485) pval [Plasmodium 
vivax] 


0.19 


1348 


AF067959 


Gallus gallus 
homeodomain protein 
HOXD-3 mRNA, 
complete cds 


1e-08 


3165574 


(AF067942) No definition line 
found [Caenorhabditis elegans] 


0.15 


1349 


Z81014 


Human DNA 
sequence from 
cosmid U65A4, 
between markers 
DXS366 and DXS87 
on chromosome X 


1e-08 


2072964 


(U93569) putative p150 [Homo 
sapiens] 


0.001 


1350 


X57103 


Human h-lys gene for 
lysozyme (upstream 
region) 


7e-09 


<NONE> 


<NONE> 


<NONE> 


1351 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


7e-09 


231629 


BILE-SALT-ACTIVATED 
LIPASE PRECURSOR ESTER 
LIPASE) (STEROL 
ESTERASE) (CHOLESTEROL 
ESTERASE) salt-activated 
lipase [Homo sapiens] sapiens] 


0.22 


1352 


L34741 


Aplysia californica 
prohormone 
convertase (PC2) 
mRNA, complete cds. 


5e-09 


322054 


cytochrome-c oxidase (EC 
1.9.3.1) chain II precursor - 
Synechocystis sp. (PCC 6803) 
>gi|581739 sp.] 


5.0 


1353 


AF052959 


Homo sapiens type 
XV collagen 
(COL15A1) gene, 
exon 6 


4e-09 


131269 


PHOTOSYSTEM II P680 
CHLOROPHYLL A 
APOPROTEIN (CP-47 
PROTEIN) 

>gi|72708|pir||QJLV6A 
photosystem II chlorophyll a- 
binding protein psbB - liverwort 
(Marchantia polymorpha) 
chloroplast >gi|11700 


1.8 
















1354 


L15470 


Streptomyces 
clavuligerus (NRRL 
3585) clavulanic acid 
biosynthesis protein 
(cla) gene, complete 
cds and clavaminate 
synthase 2 (cs2) gene, 
partial cds. 


4e-09 


586028 


(AGMATINE 
UREOHYDROLASE) (AUH) 
(PROCLAVAMINIC ACID 
AMIDINO HYDROLASE) 
>gi|1361423|pir||S57669 
Proclavaminic acid amidino 
hydrolase - Streptomyces 
clavuligerus >gi|295171 
Proclavaminic acid amidino 
hydrolase [Streptomyces 
clavuligerus] 

>gi|1586122|prf||2203286B 
proclavaminic acid amidino 
hydrolase [Streptomyces 
clavuligerus] 


4e-13 


1355 


AB002302 


Human mRNA for 
KIAA0304 gene, 
complete cds 


2e-09 


131600 


GENERAL SECRETION 
PATHWAY PROTEIN L 
product [Klebsiella pneumoniae] 
>gi|1493U (M32613) pulL 


2.5 


1356 


L34219 


Homo sapiens 
retinaldehyde-binding 
protein (CRALBP) 
gene, complete cds. 


1e-09 


<NONE> 


<NONE> 


<NONE> 


1357 


AB002302 


Human mRNA for 
KIAA0304 gene, 
complete cds 


1e-09 


2224549 


(AB002302) KIAA0304 [Homo 
sapiens] 


5.0 


1358 


D85731 


Homo sapiens 
HSPA1L gene for 
Heat shock protein 70 
testis variant, 5'UTR, 
partial sequence 


1e-09 


1389766 


(U58658) unknown [Homo 
sapiens] 


1.3 


1359 


AF064483 


Homo sapiens natural 
resistance-associated 
macrophage protein 2 
(NRAMP2) gene, 
exon 17, alternatively 
spliced non-ERE 
form, complete cds 


8e-10 


113671 


!!!! ALU CLASS F WARNING 
ENTRY !!!! 


0.72 
















1360 


AF002283 


Mus musculus alpha- 
actinin-2 associated 
LIM protein mRNA, 
alternatively spliced 
product, complete cds 


6e-10 


2996196 


(AF053367) carboxyl terminal 
LIM domain protein [Mus 
musculus] 


4e-21 


1361 


M26220 


African green 
monkey origin of 
replication 


5e-10 


2143455 


gene DMR-N9 protein - mouse 
(fragment) 


8.8 


1362 


Z78006 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment, 
SC6pA7F10 


4e-10 


2072977 


(U93574) putative p150 [Homo 
sapiens] 


0.005 


1363 


U82303 


Homo sapiens 
unknown protein 
mRNA, partial cds 


2e-10 


1825711 


(U88183) similar to the 
immunoglobulin superfamily, 
most similar to neural cell 
adhesion proteins 
[Caenorhabditis elegans] 


0.031 


1364 


AF079764 


Drosophila 
melanogaster 
enhancer of 
polycomb 


2e-10 


3757890 


(AF079764) enhancer of 
polycomb [Drosophila 
melanogaster] 


1e-10 


1365 


L24123 


Homo sapiens NRF1 
protein (NRF1) 
mRNA. 


2e-10 


3004573 


(AC004520) similar to NFE2- 
related transcription factors; 
similar to I48694 
(PID:g2137676) [Homo 
sapiens] 


4e-53 


1366 


M91454 


Orangutan alpha- 
globin gene duplicate 
region. 


1e-10 


464239 


NADH-UBIQUINONE 
OXIDOREDUCTASE CHAIN 
4 >gi|1085185|pir||S52968 
NADH dehydrogenase chain 4 - 
honeybee mitochondrion 
(SGC4) >gi|552446 (L06178) 
NADH dehydrogenase subunit 4 
[Apis mellifera ligustica] 


6.0 


1367 


D87117 


House mouse; 
Musculus domesticus 
brain mRNA for 
SAP102, complete 
cds 


6e-11 


473912 


(L31961) phosphoprotein [Mus 
cookii] 


2.2 


1368 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA 
sequence 


5e-11 


<NONE> 


<NONE> 


<NONE> 


1369 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA 
sequence 


5e-11 


<NONE> 


<NONE> 


<NONE> 


1370 


AB007874 


Homo sapiens 
KIAA0414 mRNA, 
partial cds 


5e-11 


<NONE> 


<NONE> 


<NONE> 


1371 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA 
sequence 


5e-11 


<NONE> 


<NONE> 


<NONE> 


1372 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA 
sequence 


5e-11 


<NONE> 


<NONE> 


<NONE> 


1373 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA 
sequence 


5e-11 


<NONE> 


<NONE> 


<NONE> 


1374 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA 
sequence 


5e-11 


<NONE> 


<NONE> 


<NONE> 


1375 


Z21852 


H.sapiens mRNA for 
HERV-K long 
terminal repeat 


5e-11 


419481 


gag polyprotein - human 
endogenous virus S71 


4.6 


1376 


AB007928 


Homo sapiens mRNA 
for KIAA0459 
protein, partial cds 


5e-11 


2947238 


(AF051782) diaphanous 1 
[Homo sapiens] 


2.8 


1377 


D87117 


House mouse; 
Musculus domesticus 
brain mRNA for 
SAP102, complete 
cds 


5e-11 


473912 


(L31961) phosphoprotein [Mus 
cookii] 


1.8 


1378 


AJ131501 


Homo sapiens DNA 
sequence between 
two AML1 gene 
promoters, 6423 BP 


5e-11 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


0.20 


1379 


M27826 


Human endogenous 
retroviral protease 
mRNA, complete cds. 


5e-11 


88558 


retroviral proteinase-like protein 
- human 


0.002 


1380 


U23804 


Drosophila 
melanogaster putative 
GTP-binding 
regulatory protein 
beta chain (GPB) 
mRNA, partial cds. 


5e-11 


2494916 


HYPOTHETICAL 55.2 KD 
TRP-ASP REPEATS 
CONTAINING PROTEIN 
T10F2.4 IN CHROMOSOME 
III protein; similar to G-Beta 
repeat region (Trp-Asp 
domains) of guanine nucleotide 
binding protein 


1e-30 


1381 


Z22784 


M.musculus troponin 
I gene. 


3e-11 


3892202 


(AF072889) transcription 
repressor brain factor 2 


0.053 


1382 


AB007880 


Homo sapiens 
KIAA0420 mRNA, 
complete cds 


2e-11 


<NONE> 


<NONE> 


<NONE> 


1383 


AF020361 


Homo sapiens BAX 
gene, exon 6, partial 
sequence 


2e-11 


<NONE> 


<NONE> 


<NONE> 


1384 


L35600 


Homo sapiens DNA 
sequence. 


2e-11 


1174952 


GLYCOPROTEIN D 
PRECURSOR gD [Bovine 
herpesvirus 1] 


0.25 


1385 


U21943 


Human organic anion 
transporting 
polypeptide 


2e-11 


2738223 


(U95011) brain-specific organic 
anion transporter 


9e-19 


1386 


U90878 


Homo sapiens 
carboxyl terminal 
LIM domain protein 


2e-11 


2996196 


(AF053367) carboxyl terminal 
LIM domain protein [Mus 
musculus] 


4e-23 


1387 


U31929 


Human orphan 
nuclear receptor 
(DAX1) gene, 
complete cds 


6e-12 


<NONE> 


<NONE> 


<NONE> 


1388 


M25828 


Human von 
Willebrand factor 
gene, exon 1, 2, and 
3, and three Alu 
repetitive elements. 


6e-12 


<NONE> 


<NONE> 


<NONE> 


1389 


AB020648 


Homo sapiens mRNA 
for KIAA0841 
protein, partial cds 


3e-12 


<NONE> 


<NONE> 


<NONE> 


1390 


Z15026 


H.sapiens genes for 
tumor necrosis factor 
(Tnfa) and 
lymphotoxine (Tnfb) 


2e-12 


<NONE> 


<NONE> 


<NONE> 


1391 


L28101 


Homo sapiens 
kallistatin (PI4) gene, 
exons 1-4, complete 
cds 


2e-12 


<NONE> 


<NONE> 


<NONE> 


1392 


Z47046 


Human cosmid 
?LL2C9 from Xq28 


2e-12 


<NONE> 


<NONE> 


<NONE> 






1393 


Z79007 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment, 
SC6pA20E2 


2e-12 


106322 


hypothetical protein (L1H 3' 
region) - human 


1.5 


1394 


U34377 


Human tyrosine 
kinase TXK (txk) 
gene, exon 13. 


1e-12 


151484 


(M55524) ORF 4; putative 
[Pseudomonas aeruginosa] 


4.3 


1395 


D70845 


Mus musculus apg-1 
gene for novel 
member of heat shock 
protein 110, promoter 
region 


1e-12 


113658 


ALKALINE PROTEINASE 
PRECURSOR (ALP) precursor - 
fungus (Acremonium 
chrysogenum) 


3.5 


1396 


M63978 


Human vascular 
endothelial growth 
factor gene, exon 8. 


1e-12 


3982737 


(AF069731) calmodulin- 
dependent protein kinase II beta 
M isoform [Rattus norvegicus] 


0.083 


1397 


U60266 


Homo sapiens 
lysosomal alpha- 
mannosidase (manB) 
mRNA, complete cds 


8e-13 


<NONE> 


<NONE> 


<NONE> 


1398 


Z68297 


Caenorhabditis 
elegans cosmid 
F11A10, complete 
sequence 
[Caenorhabditis 
elegans] 


7e-13 


2393734 


(AC002542) similar to C. 
elegans F11A10.5; 80% 
similarity to Z68297 
(PID:g1130619) [Homo 
sapiens] 


5e-34 


1399 


Z68297 


Caenorhabditis 
elegans cosmid 
F11A10, complete 
sequence 
[Caenorhabditis 
elegans] 


7e-13 


2393734 


(AC002542) similar to C. 
elegans F11A10.5; 80% 
similarity to Z68297 
(PID:g1130619) [Homo 
sapiens] 


3e-38 


1400 


Z68885 


Human DNA 
sequence from 
cosmid L21F12B, 
Huntington's Disease 
Region, chromosome 
4p16.3, contains 
EST. 


6e-13 


<NONE> 


<NONE> 


<NONE> 


1401 


X76104 


H.sapiens DAP- 
kinase mRNA 


6e-13 


2911154 


(AB007143) ZIP-kinase [Mus 
musculus] 


0.007 


1402 


Z78668 


H.sapiens flow-sorted 
chromosome 6 TaqI 
fragment, 
SC6pA13G4 


5e-13 


106322 


hypothetical protein (L1H 3' 
region) - human 


2e-06 


1403 


L35600 


Homo sapiens DNA 
sequence. 


3e-13 


3184290 


(AC004136) hypothetical 
protein [Arabidopsis thaliana] 


1.7 






1404 


AF090452 


Cloning vector 
pKODT complete 
sequence 


2e-13 


3876730 


(Z49966) F35C11.4 
[Caenorhabditis elegans] 


7.8 


1405 


D28126 


Human gene for ATP 
synthase alpha 
subunit, complete cds 
(exon 1 to 12) 


2e-13 


419481 


gag polyprotein - human 
endogenous virus S71 


3.4 


1406 


AF005219 


Homo sapiens 
transcription factor 
HOXD13 


2e-13 


2822166 


(AC004080) transcription factor 
HOXA13 [Homo sapiens] 


5e-09 


1407 


AB018301 


Homo sapiens mRNA 
for KIAA0758 
protein, partial cds 


2e-13 


3882237 


(AB018301) KIAA0758 protein 
[Homo sapiens] 


1e-23 


1408 


D70845 


Mus musculus apg-1 
gene for novel 
member of heat shock 
protein 110, promoter 
region 


1e-13 


113658 


ALKALINE PROTEINASE 
PRECURSOR (ALP) precursor - 
fungus (Acremonium 
chrysogenum) 


3T 


1409 


AG000691 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
T171BG33 


8e-14 


930045 


(X15332) alpha-1 (III) collagen 
[Homo sapiens] 


3e-04 


1410 


D30785 


Mouse mRNA for 
neuropsin, complete 
cds 


8e-14 


3559978 


(AJ005641) serine protease 
[Rattus rattus] 


2e-12 


1411 


U32710 


Haemophilus 
influenzae Rd section 
25 of 163 of the 
complete genome 


8e-14 


4106673 


(AL035064) queuine trna- 
ribosyltransferase 
[Schizosaccharomyces pombe] 


2e-38 


1412 


AG000886 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
64E11X19 


7e-14 


1363925 


hypothetical protein 2 - North 
American opossum (fragment) 
>gi|897721 (Z48955) ORF-2, 
putative RT [Didelphis 
virginiana] 


1.1 


1413 


Z62664 


H.sapiens CpG DNA, 
clone 71d11, forward 
read cpg71d11.ft1a. 


7e-14 


3953461 


(AC002328) F20N2.6 
[Arabidopsis thaliana] 


0.085 


1414 


AB014532 


Homo sapiens mRNA 
for KIAA0632 
protein, partial cds 


7e-14 


113668 


!!!! ALU CLASS C WARNING 
ENTRY !!!! 


0.040 
















1415 


Z96478 


H.sapiens telomeric 
DNA sequence, clone 
20PTEL004, read 
20PTEL0O004,seq 


7e-14 


2981631 


(AB012223) ORF2 [Canis 
familiaris] 


2e-04 


1416 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-14 


<NONE> 


<NONE> 


<NONE> 


1417 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


4e-14 


<NONE> 


<NONE> 


<NONE> 


1418 


AF033349 


Homo sapiens MLL 
gene breakpoint 
cluster region, intron 
I, partial sequence 


3e-14 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


9.3 


1419 


AC001526 


Homo sapiens 
(subclone 4_f6 from 
P1 H54) DNA 
sequence 


3e-14 


99861 


extensin - almond >gi 20420 i 
(X65718) extensin 


9.2 


1420 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-14 


728832 


!!!! ALU SUBFAMILY SB 
WARNING ENTRY 


0.15 


1421 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-14 


3913573 


EPHRIN-A2 PRECURSOR 
(EPH-RELATED RECEPTOR 
TYROSINE KINASE LIGAND 
6) (LERK-6) sapiens] 
>gi|2924761 (AC004258) 
EPL6 HUMAN [Homo sapiens] 


8.7 


1422 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


9e-15 


119040 


E1B PROTEIN, SMALL T- 
ANTIGEN (E1B 19K) 
>gi|74142|pir||Q1AD25 early 
E1B 21K protein II - human 
adenovirus 5 >gi|58489 
(X02996) mRNA 5 first reading 
frame [Human adenovirus type 
5] adenovirus type 5] 
>gi|209797 (J01969) 21 kD 
protein 


1.5 












1423 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


8e-15 


477102 


transcription factor GATA-4, 
retinoic acid-inducible - mouse 
>gi|293345 (M98339) GATA- 
binding transcription factor 
[Mus musculus] 


0.57 


1424 


AB012223 


Canis familiaris LINE 
1 element ORF2 
mRNA, complete cds 


8e-15 


92385 


hypothetical protein - rat 
(fragment) 


0.003 


1425 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-15 


<NONE> 


<NONE> 


<NONE> 


1426 


X12433 


Human pHS 1-2 
mRNA with ORF 
homologous to 
membrane receptor 
proteins 


3e-15 


422532 


collagen alpha 3(IV) chain - sea 
urchin 


8.9 


1427 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-15 


1353143 


PROBABLE NUCLEAR 
HORMONE RECEPTOR 
E02H1.7 

>gi|3875431|gnl|PID|e1344980 
(Z47075) similar to Zinc finger, 
C4 type (two domains) 
[Caenorhabditis elegans] 


5.0 


1428 


Z69651 


Human DNA 
sequence from 
cosmid L75B9. 
Huntington's Disease 
Region, chromosome 
4p16.3 


3e-15 


403460 


(L24521) transformation-related 
protein [Homo sapiens] 


0.60 


1429 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-15 


108750 


Ig heavy chain precursor 
(B/MT.4A.17.H5.A5) - bovine 
>gi|440 (X62916) anti- 
testosterone antibody [Bos 
taurus] 


1.1 


1430 


X83299 


H. sapiens SMA3 
mRNA 


2e-15 


671530 


(X83299) SMA3 gene product 
[Homo sapiens] 


0.32 


1431 


U01377 


Human p300 protein 
mRNA, complete cds. 
> :: gb|I62297|I62297 
Sequence 1 from 
patent US 5658784 


2e-15 


3024341 


E1A-ASSOCIATED PROTEIN 
P300 


0.019 


1432 


X16516 


Mouse MHC (Qa) Q1 
k gene for class I 
antigen, exons 4-8 


1e-15 


2496897 


in i ru i rt£ 1 1L,.-vl i n.i,j 

PRUltliN L16UU.6 W 

CHROMOSOME III 
>gi|3874384|gnl|PID|e1344078 
EST EMBL:C08256 comes 
from this gene; cDNA EST 
EMBL:C09941 comes from this 
gene; cDNA EST yk340al0.3 
comes from this gene; cDNA 
EST yk340al0.5 comes from 
this gene [Ca... 


7e-08 


1433 


M74165 


Chicken tensin 
mRNA, complete cds 


1e-15 


283920 


tensin - chicken >gi|212752 
(M74165) tensin 


2e-19 


1434 


X71893 


H.sapiens gene for 
immunoglobulin 
kappa light chain 
variable region 04 
and 05 


9e-16 


<NONE> 


<NONE> 


<NONE> 


1435 


U05227 


Human Rar protein 
mRNA, complete cds. 


9e-16 


3036779 


|V^o*f*f /y; matcn: multiple 
proteins; match: 000407 
Q12829 P22127 P36861 
Q40219; match: P70550 
Q41022 P22125 Q08155 
P35286; match: P5U4S P51147 
P35293 P36861 P352S9; match: 
P352S4 Q402I7 P51152 
P51157 P51158; match: Q41022 


3e-06 


1436 


M23404 


Chicken erythrocyte 
anion transport 
protein (band3) 
mRNA, complete cds. 


9e-16 


726403 


(U23175) similar to anion 
exchange protein 
[Caenorhabditis elegans] 


le-28 


1437 


X16145 


Rat mRNA for liver a- 
L-Fucosidase (EC 
3.2.1.51) 


9e-16 


67502 


alpha-L-fucosidase (EC 
3.2.1.51) precursor, tissue - 
human >gi|178409 (M29877) 
alpha-L-fucosidase precursor 
(EC 3.2.1.5) [Homo sapiens] 


2e-29 


1438 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


8e-16 


<NONE> 


<NONE> 


<NONE> 


1439 


AF076981 


Mus musculus brain 
mitochondrial carrier 
protein BMCP1 
(Bmcp1) mRNA, 
complete cds 


8e-16 


3851540 


(AF078544) brain mitochondrial 
carrier protein-1 [Homo sapiens] 


2e-13 






1440 


Z54349 


H.sapiens MN/CA9 
GENE 


5e-16 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


0.002 


1441 


AF077003 


Mus musculus SH3 
domain-containing 
adapter protein 
mRNA, complete cds 


3e-16 


309123 


(M35526) complement 
component C5D [Mus 
musculus] 




1442 


X64587 


M.musculus mRNA 
for splicing factor 
U2AF(65 kD) 


3e-16 


2143767 


glycoprotein - rat >gi|986943 
(L08134) glycoprotein [Rattus 
norvegicus] norvegicus] 


0.003 


1443 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


3e-16 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-20 


1444 


Z73987 


Human DNA 
sequence from 
cosmid N120B6 on 
chromosome 22 
Contains ESTs, 
complete sequence 
[Homo sapiens] 


1e-16 


<NONE> 


<NONE> 


<NONE> 


1445 


M53318 


Homo sapiens ala 
gene. 


1e-16 


<NONE> 


<NONE> 


<NONE> 


1446 


U44103 


Human small GTP 
binding protein Rab9 
mRNA, complete cds 


1e-16 


1552584 


(Z80233) hypothetical protein 
Rv0029 


1.3 


1447 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


9e-17 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


2e-20 


1448 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-17 


<NONE> 


<NONE> 


<NONE> 


1449 


M76762 


Mus musculus 
ribosomal protein (Ke 
3) gene, exons 1 to 5, 
and complete cds. 


1e-17 


1073048 


pupR protein - Pseudomonas 
putida >gi|525260 


0.36 


1450 


D50561 


Human DNA, 
replication enhancing 
element (REE1) 


4e-18 


126295 


LINE-1 REVERSE 
TRANSCRIPTASE 
HOMOLOG 


0.78 


1451 


D16431 


Human mRNA for 
hepatoma-derived 
growth factor, 
complete cds 


4e-18 


3242079 


(AJ006984) proline-rich protein 


0.018 
















1452 


AF088983 


Mus musculus heat 
shock protein hsp40-2 
mRNA, complete cds 


4e-18 


3873707 


(Z73102) Similarity to B.subtilis 
DNAJ protein 
(SW:DNAJ_BACSU); cDNA 
EST yk437a1.5 comes from this 
gene [Caenorhabditis elegans] 


9e-25 


1453 


U60205 


Human methyl sterol 
oxidase (ERG25) 
mRNA, complete cds 


3e-18 


<NONE> 


<NONE> 


<NONE> 


1454 


AF038177 


Homo sapiens clone 
23899 mRNA 
sequence 


1e-18 


1360775 


G protein-coupled receptor 74 - 
equine herpesvirus 2 >gi|695246 
(U20824) G protein-coupled 
receptor [Equine herpesvirus 2] 


5.1 


1455 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


1e-18 




(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-21 


1456 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


1e-18 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-22 


1457 


U34374 


Human tyrosine 
kinase TXK (txk) 
gene, exons 9 and 10. 


1e-19 


<NONE> 


<NONE> 


<NONE> 


1458 


AB006969 


Homo sapiens 
hGAAl mRNA, 
complete cds 


1e-19 


4151809 


(AF102855) synaptic SAPAP- 
interacting protein Synamon 


0.19 


1459 


AB002293 


Human mRNA for 
KIAA0295 gene, 
partial cds 


1e-19 


2224531 


(AB002293) KIAA0295 [Homo 
sapiens] 


6e-17 


1460 


Z59664 


H.sapiens CpG DNA, 
clone 168f9, reverse 
read cpg168f9.rt1a. 


5e-20 


3880251 


(Z82055) predicted using 
Genefinder 


6.5 


1461 


M73837 


Human modulator 
recognition factor 2 
(MRF-2) mRNA, 
complete cds. 


5e-20 


284313 


modulator recognition factor 2 - 
human factor 2 [Homo sapiens] 


0.019 





1462 


U24267 


Human pyrroline-5- 
carboxylate 
dehydrogenase 


5e-20 


2506350 


DELTA-1-PYRROLINE-5- 
CARBOXYLATE 
DEHYDROGENASE 
PRECURSOR (P5C 
DEHYDROGENASE) 
>gi|1353248 sapiens] 
>gi|1353250 (U24267) pyrroline 
5-carboxylate dehydrogenase 
[Homo sapiens] 
>gi|1589585|prf||2211355A 
Delta 1-pyrroline-5-carboxylate 
dehydrogenase [Homo sapiens] 


5e-04 


1463 


U13262 


Mus musculus myelin 
gene expression 
factor 


4e-20 


536926 


(U13262) myelin gene 
expression factor [Mus 
musculus] 


3e-07 


1464 


U13262 


Mus musculus myelin 
gene expression 
factor 


4e-20 


3126878 


(AF061832) M4 protein 
deletion mutant [Homo sapiens] 


1e-08 


1465 


Z61239 


H.sapiens CpG DNA, 
clone 48f10, forward 
read cpg48f10.ft1a. 


4e-20 


1669601 


(D88747) AR401 [Arabidopsis 
thaliana] 


8e-19 


1466 


U89915 


Mus musculus 
junctional adhesion 
molecule (Jam) 
mRNA, complete cds 


1e-20 


3462455 


(U89915) junctional adhesion 
molecule [Mus musculus] 


7e-11 


1467 


AF029071 


Gallus gallus p52 pro- 
apototic protein 
mRNA, complete cds 


7e-22 


2599492 


(AF029071) p52 pro-apototic 
protein [Gallus gallus] 


1e-15 


1468 


M25636 


Figure 4. Nucleotide 
sequence of the 
pKS36 1.797 kb 
insert. 


6e-22 


1196398 


(M21305) unknown protein 
[Homo sapiens] 


0.65 


1469 


AB020655


Homo sapiens mRNA
for KIAA0848
protein, complete cds


6e-22 


4240325


(AB020725) KIAA0918 protein
[Homo sapiens]


1e-19















PROCOLLAGEN ALPHA




1470 


S80935 


chorionic 

gonadotropin beta 1 
(CG beta I) subunit 


5e-22 


115310 


1(IV) CHAIN PRECURSOR

>gi|84917|pir||A31893 collagen
alpha 1(IV) chain precursor -
fruit fly (Drosophila
melanogaster) melanogaster]
>gi|157078 (M96575) type IV
collagen pro-collagen
[Drosophila melanogaster]


0.027 


1471 


AF053066 


Homo sapiens 
microsatellite 
D5S2926 sequence 


2e-22 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


3e-04


1472 


U55177 


Danio rerio carbonic 
anhydrase homolog 
CAH-Z mRNA, 
complete cds 


2e-22 


3123190 


CARBONIC ANHYDRASE 
(CARBONATE 

DEHYDRATASE) >gi|2576335 
(U55177) CAH-Z [Danio rerio]


Se-L4 


1473 


AF064250 


Gallus gallus 
ubiquitin specific 
protease 66 


2e-22 


2736064 


(AF016107) ubiquitin specific 
protease 41 [Gallus gallus]


7e-37 


1474 


AF030880 


Homo sapiens 
pendrin (PDS)
mRNA, complete cds 


2e-22 


729367 


DRA PROTEIN (DOWN-
REGULATED IN ADENOMA)
>gi|2135020|pir||A47456 down-
regulated in adenoma (DRA) -
human >gi|291964 (L02785)
Nuclear localization signal at
AA 569-573, 576-580, 579-583;
acidic transcr. activ. domain 620-
640; homeobox motif 653-676
[Homo sapiens]


4e-53 


1475 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


6e-23 


<NONE> 


<NONE> 


<NONE> 


1476 


X57398


Human mRNA for
pM5 protein 


3e-23 


107350 


Pm5 protein - human 
>gi|1335273|gnl|PID|e36241


1e-04


1477 


AB010998


Rattus norvegicus
PAD-R11 mRNA for
Peptidylarginine
deiminase type I,
complete cds


2e-23 


<NONE> 


<NONE> 


<NONE> 


1478 


D10871


Human h NAT allele
2-2 gene for
arylamine N-
acetyl transferase


2e-23 


171200


(J04734) CDC6 protein
[Saccharomyces cerevisiae]


9.8


1479 


D10871


Human h NAT allele
1-2 gene for
arylamine N-
acetyl transferase


2e-23 


171200


(J04734) CDC6 protein
[Saccharomyces cerevisiae]


8.3





1480


AF024541


Homo sapiens MLL- 
AF4 fusion protein 
mRNA, partial cds 


2e-23 


2136142


serine/proline-rich FEL protein
splice form 1 - human


1e-20


1481 


L13773


Human AF-4 mRNA
complete cds.


2e-23


3063962


(AF031404) MLL-AF4 fusion
protein [Homo sapiens]


1e-20


1482 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-24 


<NONE>


<NONE>


<NONE>


1483


U75467


Drosophila
melanogaster Rga and
Atu genes, complete
cds


8e-24


1658503


(U75467) Atu [Drosophila
melanogaster]


2e-37


1484


U 1/076 


Human HepG2 partial
cDNA, clone
hmd5a09m5


7e-24


<NONE>


<NONE> 


<NONE> 


1485


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-24 


1169643 


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


7e-10


1486


M11167


Human 28S
ribosomal RNA gene.


2e-24 


3875481 


(Z81054) predicted using
Genefinder; Similarity to UDP-
glucoronosyltransferases


5.1


1487


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


2e-24 


549173 


USP1 PROTEIN PRECURSOR 
>gi|169623 


1.2


1488


AB003468


Cloning vector 
pAP3neo DNA, 
complete sequence 


2e-24


987050


(X65335) lacZ gene product
[unidentified cloning vector]


0.058




1489


X03541


Human mRNA of trk
oncogene > ::
gb|I96186|I96186 
Sequence 23 from 
patent US 5734039 


2e-24


325465


(M74509) [Human endogenous
retrovirus type C oncovirus
sequence.], gene product [Homo
sapiens]


3e-04


1490 


L81652


Homo sapiens
(subclone 2_g11 from
P1 H43) DNA
sequence


2e-24 


225047


reverse transcriptase related
protein [Homo sapiens]


4e-12


1491 


U95760


Drosophila
melanogaster
strawberry notch
(sno) mRNA,
complete cds


2e-24 


2078282


(U95760) Sno [Drosophila
melanogaster]


2e-41


1492


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-25


2623773


(AF004835) tyrocidine
synthetase 3 [Brevibacillus
brevis]


8.6



1493 AB002405



Homo sapiens mRNA 
for LAK-4p 
complete cds 



8e-25 



2496822 



HYPOTHETICAL 127.3 KD
PROTEIN B0416.1 IN
CHROMOSOME X >gi|746502
(U23516) B0416.1 gene product
[Caenorhabditis elegans] 



1494 K03002



Human mRNA from 
chromosome 15 gene 
with homology to 
MHC-HLA-SB-1 
intron A. 



8e-25 



1514614 



(X92842) nuclear protein [Mus 
musculus] 



1495 U61232



Human tubulin- 
folding cofactor E 
mRNA, complete cds 



7e-25 



1465772 



(U61232) cofactor E [Homo 
sapiens] 



1496 U10245



Arabidopsis thaliana 
Col-O putative RNA 
helicase A mRNA, 
complete cds. 



5e-25 



1353239 



(U10245) putative RNA
helicase A [Arabidopsis 
thaliana] 



2e-05 



1497 X89211



H.sapiens DNA for 
endogenous retroviral 
like element 



1498 L81652



Homo sapiens 
(subclone 2_g11 from
P1 H43) DNA
sequence 



3e-25 



2065210 



(Y12713) Pro-Pol-dUTPase 
polyprotein 



3e-25 



2072961 



(U93568) putative pl50 [Homo 
sapiens] 



5e-06 



5e-16 



1499 X82895



H.sapiens mRNA for 
DLG2 



2e-25 



1500 M36654



Mouse homeo box 

6 (Hox-2.6) mRNA 
complete cds 



2497511 



MAGUK P55 SUBFAMILY
MEMBER 2 (MPP2 PROTEIN) 
(DISCS, LARGE HOMOLOG 

2) 



1501 



L36315 



1502 AB018281



Mus musculus (clone 
pMLZ-1) zinc finger 
protein 



Homo sapiens mRNA 
for KIAA0738 
protein, complete cds 



9e-26 



3323169 



(AE001255)T. pallidum 
predicted coding region TP0854 



9e-26 



1806134 



(Z67747) zinc finger protein 
[Mus musculus]



9e-26 



728831



!!!! ALU SUBFAMILY J
WARNING ENTRY 



1e-34



1.9 



4e-05 



1503 AF017433



Homo sapiens 
putative transcription 
factor CR53 



9e-26


3219985


ZINC FINGER PROTEIN ZFP-29



le-i: 


1504 


AC001225


Homo sapiens
(subclone 2_e6 from
BAC H94) DNA
sequence


8e-26 


2653713


(U91823) small S protein
[Hepatitis B virus]


43 1 


1505


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-26 


283446


cyteine-rich surface antigen 72,
CRP72 - Giardia lamblia
(fragment)


34 1 


1506 


X94912 


H. sapiens Pr22 gene


3e-26


728837


!!!! ALU SUBFAMILY SQ
WARNING ENTRY


4e-09


1507 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-26


<NONE>


<NONE>


<NONE> 


1508 


U44103 


Human small GTP 
binding protein Rab9 
mRNA, complete cds 


1e-26


3327038


(AB014512) KIAA0612 protein
[Homo sapiens]


8.7


1509 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-27 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.14


1510


AUUU12I2


Homo sapiens 
genomic DNA, 21q 
region, clone: 
9H1IN46 


9e-27 


126296 


LINE- 1 REVERSE 
TRANSCRIPTASE 
HOMOLOG protein 
[Nycticebus coucang]


0.012


1511


AF027131


Mus musculus mucin 
glycoprotein MUC3 
mRNA, partial cds 


9e-27


2589172


(U76551) mucin Muc3 [Rattus
norvegicus]


2e-14


1512 


U49057


Rattus norvegicus
CTD-binding SR-like
protein rA9 mRNA,
complete cds


5e-27


1438534


(U49057) rA9 [Rattus
norvegicus]


1e-04


1513 


J03764


Human, plasminogen
activator inhibitor-1
gene, exons 2 to 9.


3e-27 


<NONE> 


<NONE> 


<NONE> 


1514 


Z78160


M. musculus partial
cochlear mRNA
(clone 28D2)


3e-27


1490362


(Z78160) unknown [Mus
musculus]


2e-05


1515 


Z64210


H. sapiens CpG DNA,
clone 99b4, reverse
read cpg99b4.rt1a.


3e-27


2257538


(AB004538) LIPOIC ACID
SYNTHETASE
PRECURSOR(LIP-SYN)
[Schizosaccharomyces pombe]


1e-06






Homo sapiens 










1516 


L35659 


(subclone H8 6_h6 
from Pi 35 H5 CS) 
DNA sequence. 


1e-27


<NONE> 


<NONE> 


<NONE> 


1517 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


1644471


(U72686) odorant receptor 4
[Danio rerio]


7.5


1518 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


2738388


(AF003534) hypothetical
protein 004L [Chilo iridescent
virus]


6.7 


1519 


AB009271 


Homo sapiens gene 
for BCNT, partial cds


1e-27


3880909


(AL032636) Y40B1B.3
[Caenorhabditis elegans]


4.6 


1520 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.85 


1521 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


121805


ENDOGLUCANASE A
PRECURSOR 


0.58 


1522 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27


3722000


(AF035323) survival motor
neuron protein [Bos taurus]


0.10 


1523 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


3328188


(AF074902) laminin alpha chain
[Caenorhabditis elegans]


0.083 


1524 


AF074382 


Homo sapiens IkB 
kinase gamma subunit


1e-27


3641280 


(AF074382) IkB kinase gamma 
subunit [Homo sapiens] 


0.041 


1525 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-04 


1526 


L78778 


Homo sapiens 
(subclone 2_e10 from
P1 H49) DNA
sequence


1e-27


225047


reverse transcriptase related
protein [Homo sapiens]


2e-09 


1527 


L03427 


Human zinc finger 
protein basonuclin 
mRNA, complete cds.


1e-27


1488275 


(U59694) zinc finger protein 
basonuclin [Homo sapiens] 


9e-22 


1528


U09954 


Human ribosomal 
protein L9 gene, 5' 
region and complete 
cds. 


4e-28


2257538


(AB004538) LIPOIC ACID
SYNTHETASE
PRECURSOR(LIP-SYN)
[Schizosaccharomyces pombe]


2e-04 



WO 01/02568 



PCT/US00/18374 





















1529 


Z64210 


H.sapiens CpG DNA, 
clone 99b4, reverse
read cpg99b4.rt1a.


4e-28 


3878570 


(Z46J8 i ) similar ro ltnmr ^^ > ^r^ 
synthase; cDNA EST yk283b6.: 
comes from this gene; cDNA 
EST yk283b6.5 comes from this 
gene; cDNA EST yk472f5.3 
comes from this gene; cDNA 
toi yk4-/ztD.3 comes from this 
gene; cDNA EST yk476e7.3... 


7e-11


1530 


U55177 


Danio rerio carbonic 
anhydrase homolog 
CAH-Z mRNA, 
complete cds 




"7 lotion 


CARBONIC ANHYDRASE 
(CARBONATE 

DEHYDRATASE) >gi|2576335
(U55177) CAH-Z [Danio rerio]


5e-21


1531 


D43682 


Human mRNA for 
very-long-chain acyl- 
CoA dehydrogenase 
(VLCAD), complete 
cds 


4e-28 


1351839 


ACYL-COA
DEHYDROGENASE, VERY-
LONG-CHAIN SPECIFIC
PRECURSOR (VLCAD) 
>gi|930358 taurus] 


3e-27 


1532 


AF016591


Homo sapiens
survival motor neuron
pseudogene, complete
sequence 


3e-28 


728831 


!!!! ALU SUBFAMILY J
WARNING ENTRY 


3e-08 


1533 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds






!!!! ALU SUBFAMILY SB
WARNING ENTRY 


2.5 


1534 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.004 


1535 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-28


1169643


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


6e-04 


1536 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-05 












(AC005990) Contains repeated




1537 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


2e-06 


1538 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


2e-09 


1539


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-09


1540 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


5e-10 


1541


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-11


1542 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-28 


3157926 


(AC002131) Strong similarity to
extensin-like protein gb|Z34465
from Zea mays. [Arabidopsis
thaliana]


8e-12


1543 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1544 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1545 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 









Mus musculus 










1546 


AF100694 


Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1547


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1548 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE>


1549


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE>


<NONE> 


<NONE> 


1550 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1551


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1552 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1553


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1554


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1555 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1556 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1557 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE>


1558


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1559 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1560 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1561 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE>






Mus musculus










1562 


AF100694 


Pontin52 mRNA, 
complete cds 


1e-28


<NONE> 


<NONE> 


<NONE> 


1563


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1564 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1565 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE>


<NONE> 


<NONE> 


1566 


M87708 


Human simple repeat 
polymorphism. 


1e-28


<NONE> 


<NONE> 


<NONE> 


1567 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1568 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


3924779 


B; cDNA EST yk450d8.5 comes 
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes fr... 

>gi|3924881|gnl|PID|e1354569
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes from... 


3.0 


1569 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


1169643 


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


0.66 
















1570 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


3924779 


B; cDNA EST yk450d8.5 comes
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes fr... 

>gi|3924881|gnl|PID|e1354569
from this gene; cDNA EST
yk249a6.5 comes from this
gene; cDNA EST yk219a2.5
comes from this gene; cDNA
EST yk355e4.5 comes from this
gene; cDNA EST yk224f4.5 
comes from... 


0.65 


1571


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.49 


1572 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.49 


1573 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


283446 


cyteine-rich surface antigen 72, 
CRP72 - Giardia lamblia 
(fragment)


0.45 


1574 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


2498937 


SPERMATOPHORIN SP23 
PRECURSOR mealworm 
>gi|161725 (M92928) structural
protein 


0.33 


1575 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


1492050


(U60315) MC107L [Molluscum
contagiosum virus subtype 1]


0.18


1576 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


2133579


spermatophorin Sp23 - yellow
mealworm molitor]


0.088


1577 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.018


1578


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.016 












DEHYDRIN DHN3 




1579 


AF100694 


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.012


1580 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


0.010 


1581 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.002 


1582 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


1169643


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


0.002 


1583 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.002 


1584 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.002 


1585 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.002 


1586 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.001












(AC005990) Contains repeated




1587 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.001 


1588 


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-04 


1589 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-04 


1590 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-04 


1591 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


2e-04 


1592 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-04 


1593 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-05 


1594 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-05


1595 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05


1596 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05


1597 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-06


1598 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-06


1599 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-06


1600


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


544357


RNA-BINDING PROTEIN
FUS/TLS protein [human,
Peptide, 526 aa] [Homo sapiens]


4e-06












(AC005990) Contains repeated




1601 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


2e-06 


1602 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


2e-06 


1603 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


9e-07 


1604 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


8e-07 


1605 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


1169643


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


7e-07


1606 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-07 


1607 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


5e-0~ 












(AC005990) Contains repeated




1608 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


3e-07 


1609 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-07


1610 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-07


1611 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


7e-08


1612 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-08 


1613 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-09 












(AC005990) Contains repeated




1614


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-09 


1615 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


4e-09 


1616 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


7e-10


1617 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-10


1618 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-10 


1619 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


4e-10 












(AC005990) Contains repeated




1620 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-10 


1621 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-11


1622 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-12 


1623 


AF032896 


Petromyzon marinus 
polyadenylate binding 
protein 


1e-28


1082703 


polyadenylate binding protein II 
human


2e-27 


1624 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-29 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.013 


1625 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-29 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor]


6e-04 


1626 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


9e-29 


3876465


(Z81071) predicted using 
Genefinder; Similarity to 
Human small nuclear 
ribonucleoprotein E cDNA EST 
yk375g7.5 comes from this 
gene; cDNA EST yk435r3.3 
comes from this gen...


9e-06 


1627 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-29


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-06 












ADP-RIBOSYLATION




1628 




Mus musculus 
Pontin52 mRNA, 
complete cds 


^fe-zy 


72os8j 


FACTOR 3 fruit fly (Drosophila
melanogaster) >gi|507234 
(L25063) ADP ribosylation 
factor 3 [Drosophila 
melanogaster] 


0.016 


1629 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-29 


544357 


RNA-BINDING PROTEIN 
FUS/TLS protein [human,
Peptide, 526 aa] [Homo sapiens] 


2e-07 


1630 


AF100694


Pontin52 mRNA, 
complete cds 


4e-29 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-08


1631 


D43682 


Human mRNA for 
very-long-chain acyl-
CoA dehydrogenase
(VLCAD), complete
cds 


4e-29 


1168287 


ACYL-COA

DEHYDROGENASE, VERY- 
LONG-CHAIN SPECIFIC 
PRECURSOR (VLCAD) 
dehydrogenase precursor - rat 
Acyl-CoA dehydrogenase 
[Rattus norvegicus]


6e-37 


1632 


Y07660


M. tuberculosis accBC 
gene 


4e-_V 


2113935


(Z95556) accDl
[Mycobacterium tuberculosis]


3e-47 


1633 


AJJJO / 


Human alpha-satellite 
DNA from clone 

piKA--. ! 


ie-zy 


<NONE> 


<NONE> 


<NONE> 


1634 


LSI 866 


Homo sapiens 
(subclone l_fl from 
PI H54) DNA 
sequence 


1e-29


<NONE> 


<NONE> • 


<NONE> 


1635 


S75940 


{Alu repeats, clone
52H10} [human,
colonic mucosa,
Genomic, 943 nt]


1e-29


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


1e-07


1636 


AB001907 


Homo sapiens 
PACE4 gene, exon 13


1e-29


728831


ALU SUBFAMILY J
WARNING ENTRY


2e-09


1637 


AF077003


Mus musculus SH3
domain-containing
adapter protein
mRNA, complete cds


5e-30 


<NONE> 


<NONE> 


<NONE> 












(AC005990) Contains repeated 




1638 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-30 


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


3e-10 


1639 


M27072 


Xenopus laevis 
poly(A)-binding 
protein (ABP-EF) 
mRNA, complete cds. 


4e-30 


1352709


POLYADENYLATE- 
BINDING PROTEIN 
polyadenylate-binding protein - 
African clawed frog laevis]


5e-21 


1640 


X58386 


B.taurus mRNA for 
bovine vacuolar 
ATPase subunit A 


2e-30 


2773154 


(AF039573) abscisic acid- and 
stress-inducible protein 


4.3 


1641 


Y07660 


M.tuberculosis accBC 
gene 


1e-30


2113935 


(Z95556) accDl 
[Mycobacterium tuberculosis] 


4e-47 


1642 


AJ236940 


Sus scrofa mRNA for
hypothetical protein
(5'; clone 7C4)


4e-31


4102021 


(AF007561) delta 6-desaturase 
[Borago officinalis]


7.4 


1643 


AF039400 


Homo sapiens 
calcium-dependent 
chloride channel-1
(hCLCAl) mRNA, 
complete cds 


2e-31 


3721912 


(AB017156) gob-5 [Mus 
musculus] 


7e-08 


1644 


L77036 


Homo sapiens 
(subclone 5_d9 from 
PI H19) DNA 
sequence. 


1e-31


461663 


BOMBYXIN B-2 HOMOLOG
PRECURSOR silkmoth
>gi|217385|gnl|PID|d1003528
(D13924) Samia bombyxin
homolog B-2 [Samia cynthia]


1.1 


1645 


X61971 


H. sapiens mRNA for 
macropain subunit 
delta 


1e-31


296734 


(X61971) macropain subunit
delta [Homo sapiens] 


3e-06 


1646 


L00016 


human mitochondrial 
trnas and partial 
proteins 4 & 5;
histidyl-, seryl-,
leucyl-trna genes;
urf4 and urf5 
(partial). 


5e-32 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


0.002 


1647 


M17887


Human acidic
ribosomal
phosphoprotein P2 
mRNA, complete cds. 


5e-32 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05









Human mitogen-










1659 


U53446 


responsive 

phosphoprotein DOC- 
2 mRNA, complete 
cds. 


6e-34 


3395443 


(AC004683) putative 
ammonium transporter, 3' partial


4.7 


1660 


AF013988


Homo sapiens serine 
protease mRNA, 
complete cds 


4e-34 


2507226


PROTEIN-TYROSINE
PHOSPHATASE EPSILON
PRECURSOR (R-PTP-
EPSILON) >gi|1439605
(U62387) protein tyrosine
phosphatase-e [Mus musculus]


3.2 


1661


U53446


Human mitogen-
responsive 

phosphoprotein DOC- 
2 mRNA, complete 
cds. 


2e-34 


104757 


JLrir luu protein precursor - 
chicken >gi|212254 gallus]


1.6 


1662 


AJ233632 


Homo sapiens 
endogenous retroviral 
sequence ERV-L pol 
gene, clone ERV-L 
Human6 


2e-34 


3860513 


(AJ233597) reverse 
transcriptase [Mus famulus] 


4e-10 


1663 


AF086310 


Homo sapiens full 
length insert cDNA 
clone ZD5IF08 


8e-35 


2947070 


pilliAllvc OCi/ 1 ill 

protein kinase [Arabidopsis 
thaliana] 


2.3 


1664 


X17206


Human mRNA for 
LLRep3 


3e-35 


730652 


40S RIBOSOMAL PROTEIN
S2 (STRINGS OF PEARLS 
PROTEIN) 

>gi|l085l5S|pir||S50325 

ribosomal protein S2 - fruit fly
(Drosophila melanogaster)
melanogaster] >gi|515972
(U01335) ribosomal protein S2 


2e-10 


1665 


AB011137 


Homo sapiens mRNA 
for KIAA0565 
protein, complete cds 


3e-35 


3043654 


(AB011137) KIAA0565 protein 
[Homo sapiens]


2e-16 


1666 


U62801 


Human protease M
mRNA, complete cds


2e-35 


3929231


(AF091247) potassium channel
[Rattus norvegicus]


1.0 


1667 


AF020760 


Homo sapiens serine 
protease (Omi) 
mRNA, complete cds


1e-35


2738915


(AF020760) serine protease
[Homo sapiens]


9e-14






Human DNA 










1668 




sequence from 
cosmid U235H3 on 
chromosome X 


8e-36 


1196432


(M22333) unknown protein
[Homo sapiens]


3e-10 


1669 


X06778 


Rabbit 18S rRNA 


7e-36 


118588 


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.011 


1670 


AB007962


Homo sapiens 
mRNA, chromosome 
1 specific transcript 
KIAA0493 


3e-36 


3329243 


(AE001350) hypothetical 
protein [Chlamydia trachomatis] 


3.1 


1671




Human DNA 

sequence from

cosmid U65A4,
between markers
DXS366 and DXS87
on chromosome X


3e-36 


141103


HYPOTHETICAL PROTEIN
ORF-1137 mouse


0.038 


1672 


Z81014 


Human DNA 
sequence from 

cosmid U65A4,

between markers
DXS366 and DXS87
on chromosome X


3e-36 


198651 


(M29325) ORP1 [Mus 
musculus] 


0.006 


1673 


U49082 


Human transporter 

nrnfpin /rt 1 7^ mR /-\ 
JJIOLCMI / ) lILtxl^rA. 

complete cds 


3e-36 


1840045


(U49082) transporter protein 
[Homo sapiens] 


2e-15 


1674 




Human transcription 
factor SP1 mRNA, 3' 
end. 




' 477 m 


HF-l regulatory element binding 

nrntpin - mf 

Jjl ULC 111 lul 


2e-31


1675 


AB007934 


Homo sapiens mRNA 
for KIAA0465 
protein, partial cds 


1e-36


3413892 


(AB007934) KIAA0465 protein 
[Homo sapiens] 


4e-37 


1676 


M34857 


Mouse Hox-2.5 
mRNA. 


9e-37 


106296 


homeotic protein Hox B9 - 
human (fragment) 


0.15


1677 


L35657 


Homo sapiens 
(subclone HS 5_al0 
from PI 35 H5 C8) 
DNA sequence. 


9e-37 


2072960 


(U93568) p40 [Homo sapiens]


3e-05 


1678 


X80240


H.sapiens 
endogenous 
retrovirus HERV- 
KC4 DNA 


8e-37 


4185944 


(Y17833) env protein [Human
endogenous retrovirus K]


1e-15






Human DNA 










1679 




sequence from 
cosmid U235H3 on 
chromosome X 


"C'JO 




hypothetical protein (L1H 3'

region) - human




1680 


X97303 


H.sapiens mRNA for 
Pt°-12 protein 


4e-38 


466044 


HYPOTHETICAL ZINC
FINGER PROTEIN ZK686.4 
IN CHROMOSOME III 
>gi|630780|pir||S44909 ZK686.4 
protein - Caenorhabditis elegans 
>gi|304346 (L17337) coded for 
by C. elegans cDNAs 
GenBank:M88869 and T01933;
putative [Caenorhabditis 
elegans] 


3e-37 


1681 


Y08999 


H.sapiens mRNA for 
Sop2p-like protein 


3e-38 


3334339 


SOP2-LIKE PROTEIN 


5e-06 


1682 


Z628S7 


H.sapiens CpG DNA,
clone 74g6, forward
read cpg74g6.ftla.


2e-38 


1245686 


rtlS 1 1 SH F^6D4 ^ oene 
product [Caenorhabditis 
elegans]


0.19 


1683 


U35032 


Human endogenous 
retrovirus clone 
c5.il, HERV-H
multiply spliced 

ciiVirrprmmif P'lnPr 

protease and integrase 
region mRNA, partial 
cds 


1e-38


59977 


(Z14310) tripartite fusion
transcript PLA2L [Human
endogenous retrovirus]


1e-06


1684 




Human mRNA for 
KIAA0220 gene, 
partial cds 


i C*JO 




(AC002544) Unknown gene 
product splice form-2 [Homo 
sapiens] 


8e-ll S 


1685 


M31013 


Human nonmuscle 
myosin heavy chain
(NMHC) mRNA, 3' 
end. 


1e-38


4115748 


(AB022023) nonmuscle myosin 
heavy chain B


2e-11


1686 


AF006087


Homo sapiens Arp2/3 
protein complex 
subunit p20-Arc 
(ARC20) mRNA, 
complete cds 


4e-39 


<NONE> 


<NONE> 


<NONE> 


1687 


X58374 


D.melanogaster crn 
mRNA 


4e-39 


2655888 


(AL009171) 62D9.a 
[Drosophila melanogaster] 


4e-42 


1688


D85815


Human DNA for
rhoHP1, complete cds


1e-39


134080 


GTP-BINDING PROTEIN
TC10 ras-like protein [Homo 
sapiens] 


3e-26 



















1689 


U49057 


Rattus norvegicus 
CTD-binding SR-like
protein rA9 mRNA, 
complete cds 


4e-40 


1438534 


(U49057) rA9 [Rattus 
norvegicus]


5e-05 


1690 


Y08999 


H.sapiens mRNA for 
Sop2p-like protein


4e-40 


3334339 


SOP2-LIKE PROTEIN 


9e-08


1691 


AB002293 


Human mRNA for 
KIAA0295 gene, 
partial cds 


4e-40 


2224531 


(AB002293) KIAA0295 [Homo
sapiens]


1e-30


1692 


AF086222 


Homo sapiens full 
length insert cDNA 
clone ZC66E08 


1e-40


2829669 


DOUBLE-STRANDED RNA-
SPECIFIC EDITASE 1
(DSRNA ADENOSINE
DEAMINASE) (RNA
EDITING ENZYME 1)
>gi|1707502|gnl|PID|e254627
(X99227) double-stranded RNA-
specific editase [Homo sapiens]
editase 1 hRED1-L [Homo
sapiens] >gi|2039300 (U76421) 
dsRNA adenosine deaminase 
DRADA2b [Homo sapiens] 


0.61 


1693 


AF044127 


Homo sapiens 
peroxisomal short- 
chain alcohol 
dehydrogenase 
(SCAD-SRL) mRNA, 
complete cds 


1e-40


4105190


(AF044127) peroxisomal short-
chain alcohol dehydrogenase


2e-06


1694 


U36778 


Mus musculus Sil
mRNA, complete cds


1e-40


88608 


SIL protein - human >gi|33808S 
(M74558) SIL 


6e-23 


1695 


U36778 


Mus musculus Sil
mRNA, complete cds


1e-40


88608 


SIL protein - human >gi|3380SS 
(M74558) SIL 


6e-23 


1696 


U36778 


Mus musculus Sil
mRNA, complete cds


1e-40


88608 


SIL protein - human >gi|3380SS 
(M74558) SIL 


5e-23 


1697 


U36778 


Mus musculus Sil
mRNA, complete cds


1e-40


88608


SIL protein - human >gi|3380SS
(M74558) SIL


5e-23 


1698 


AB018285


Homo sapiens mRNA
for KIAA0742
protein, partial cds


1e-40


3882205


(AB018285) KIAA0742 protein
[Homo sapiens]


6e-31 












ATP-BINDING CASSETTE




1699 


X75927 


M.musculus abc2 
mRNA 


le-40 


728773 


TRANSPORTER 1 ABC1 -
human >gi|495257 (X75926)
abc1 [Mus musculus]


3e-37 


1700 


AF038200 


Homo sapiens clone 
23954 mRNA 
sequence 


Se-41 


3211975 


(AF068195) putative 
glialblastoma cell differentiation- 
related protein [Homo sapiens] 


5e-14 


1701 




Human estrogen 
sulfotransferase
(STE) gene, exon S 

and complete cds


4e-41 




<NONE> 


<NONE> 


1702 


AF026543 


Homo sapiens 
branched chain alpha- 
ketoacid 

dehydrogenase kinase 
precursor, mRNA, 
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


2e-41 



3182923 


[3-METHYL-2- 
OXOBUTANOATE 
DEHYDROGENASE 
(LIPOAMIDE)] KINASE
PRECURSOR alpha-ketoacid 
dehydrogenase kinase precursor 
[Homo sapiens] 


2e-09 


1703 


Y07660 


M. tuberculosis accBC 
gene 


2e-41 



465847 


HYPOTHETICAL 66.5 KD 
PROTEIN F02A9.5 IN 
CHROMOSOME III 
>gi|280542|pir||S28313 
hypothetical protein F02A9.5 - 
Caenorhabditis elegans 
Genefinder: similar to Propionyl- 
CoA carboxylase beta chain; 
cDNA EST EMBL:M89013 
comes from this gene; cDNA 
EST EMBL:D28069 comes 
from this gene; cDNA EST
EMBL:D2S063 comes from this 
gene; cDNA EST ... 


3e-3S 


1704 


AG001237 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
9H11N46 


1e-41


106322


hypothetical protein (L1H 3'
region) - human


5e-09 


1705 


AB007934 


Homo sapiens mRNA 
for KIAA0465 
protein, partial cds 


1e-41


3413892


(AB007934) KIAA0465 protein
[Homo sapiens]


3e-12 


1706 


AF055029 


Homo sapiens clone 
24711 mRNA 
sequence 


5e-42 


3250681 


(AL0244S6^> putative protein 


2.2 












1-




1707


Z49747 


O.cuniculus mRNA 
for phospholipase C 


5e-42 


130227 


PHOSPHATIDYLINOSITOL-
4,5-BISPHOSPHATE
PHOSPHODIESTERASE
DELTA 1 (PLC-DELTA-1)
(PHOSPHOLIPASE C-DELTA-
1) (PLC-III) >gi|163538

^VJOnA'^Q^ nhncnhnl in'l^f C-TTT 

^jvi^uojcjj pnospuunp«ioc 111 
[Bos taurus] 


5e-36 


1708


M93651 


Human set gene, 
complete cds. 


2e-42 


<NONE> 


<NONE> 


<NONE> 


1709 


AJ236940 


Sus scrofa mRNA for
hypothetical protein 
(5': clone 7C4) 


2e-42 


2062403 


(U79010) delta 6 desaturase 
[Borago officinalis]


8.5 


1710 


J03634 


Human erythroid 
differentiation protein 
mRNA 


2e-42 


1708436 


INHIBIN BETA A CHAIN 
PRECURSOR 


2e-10 


1711 


AJ223777 


Mus musculus mRNA 
for striatin 


6e-43 


2494917


STRIATIN 

>2l{14y3/ / j)l2nl|rlJU|ezj i +l jo 


2e-32 


1712 


AF016411


Homo sapiens
potassium channel
subunit KCNA3.1B


2e-43 


2708514 


(AF016411) KCNA3.1B [Homo
sapiens]


3e-13 


1713 


AC001443 


Homo sapiens 
(subclone 2_fl0 from 
BAC 2913 


1e-43


111814 


hypothetical protein 3 - rat 
>si|565S9 


2e-06 


1714 


X82895


H. sapiens mRNA for 
DLG2 


6e-44 


2497511


MAGUK P55 SUBFAMILY
MEMBER 2 (MPP2 PROTEIN)
(DISCS, LARGE HOMOLOG
2) 


6e-52 


1715 


U17077


Human BENE
mRNA, partial cds.


3e-44 


53912 


(X57960) ribosomal protein L7
[Mus musculus] >gi|55489


8e-30 


1716 


a Tallinn 


Homo sapiens mRNA 
ror t >j^w.*j.ji prLHcui 


2e-44 


<NONE>


<NONE> 


<NONE> 


1717 


J03634 


Human erythroid 
differentiation protein 
mRNA 


2e-44 


124279 


INHIBIN BETA A CHAIN
PRECURSOR PROTEIN) 
(EDF) >gi|S7936ipir||B2424S 
inhibin beta-A chain precursor - 
human >gi|lSl947 (J03634) 
erythroid differentiation protein 
precursor [Homo sapiens] 
sapiens]

>gi|22bS50|prr]|160S260B 
inhibin betaA [Homo sapiens]


0.73 
















1718 


AB014513


Homo sapiens mRNA
for KIAA0613
protein, complete cds 


7e-45 


1911548


(S80864) cytochrome c-like
polypeptide sapiens]


1.0


1719


X76808 


H.sapiens genomic 
DNA clone d2 


7e-45 


868201 


\\j£y similar io auenyiaie 
cyclase [Caenorhabditis elegans]


2e-09 


1720 


AB021288 


Homo sapiens mRNA 
for beta 2- 
microglobulin, 
complete cds 


2e-45 


2465521 


polymerase [Cryptosporidium 
parvum]


0.15 


1721 


X63468 


H.sapiens mRNA for 
transcription factor 
TFIIE alpha


8e-46


<NONE> 


<NONE> 


<NONE> 


1722


AF019226


Homo sapiens D2-2 
mRNA, 3'UTR 


7e-46 


<NONE> 


<NONE> 


<NONE> 


1723 


D31764


Human mRNA for 
KIAA0064 gene, 
complete cds 


2e-46 


3123050 


HYPOTHETICAL PROTEIN 
KIAA0064


1e-15


1724 


K02774 


Human MHC class II
ri-L A.- D K - d e La- ps 1 
(DW4/DR4) 
pseudogene, exons 
3,4, 5,6 T clones cos II- 
3301 and cosII-801. 


1e-46


1 1 QCQ/1 A 


(Y17834) gag protein [Human
endogenous retrovirus K] 




1725 


X92109 


H.sapiens hcglX gene 


9e-47


2498185 


BRIDE OF SEVENLESS
PROTEIN PRECURSOR 
"soil 1 070 i ftfvnirll A47*S50 bride 
of sevenless precursor - fruit fly 
(Drosophila virilis) >gi|290216 
virilis]


1.4 


1726 


X93334 


H.sapiens 

mitochondrial DNA,
complete genome 


8e-47 


128753 


OXIDOREDUCTASE CHAIN 
4>gi|86696|pir||A00435 NADH 
dehydrogenase (ubiquinone)


4e-15 


1727 


M85145 


Human tumor

necrosis factor 
receptor, 3' flank. 


3e-47 


<NONE> 


<NONE> 


<NONE> 


1728


X80240


H.sapiens 
endogenous 
retrovirus HERV- 
KC4 DNA 


3e-47 


4185944 


(Y17833) env protein [Human
endogenous retrovirus K] 


7e-13 


1729 


Z63594 


H.sapiens CpG DNA, 
clone S7t'9. forward 
read cpaS7t*9.ftla . 


le-47 


3322743 


(AE001222) T. pallidum
predicted coding region TP0454 


2.4 






R.rattus mRNA for










1730


X62295


vascular type-1
angiotensin II 
receptor 


4e-48 


1209756 


(U43629) integral membrane
protein [Beta vulgaris] 


le-07 


1731 


M85145 


Human tumor 
necrosis factor 
receptor, 3' flank. 


3e-48


<NONE>






1732 


AB020712 


Homo sapiens mRNA
for KIAA0905 
protein, complete cds 


4e-49 


4240299


(AB020712) KIAA0905 protein
[Homo sapiens]


ze--u 


1733 


AB020712 


Homo sapiens mRNA
for KIAA0905 
protein, complete cds 


3e-49 


4240299 


(AB020712) KIAA0905 protein
[Homo sapiens] 


2e-:0 


1734 


X62295 


R.rattus mRNA for
vascular type-1
angiotensin II
receptor


1e-49


1209756


(U43629) integral membrane
protein [Beta vulgaris]


7e-12


1735 


AJ007509 


Homo sapiens mRNA
for E1B-55kDa-
associated protein


1e-49


3319956


(AJ007509) E1B-55kDa-
associated protein


4e-24 


1736 


X97303 


H.sapiens mRNA for
Pts-12 protein


1e-49


466044


HYPOTHETICAL ZINC
FINGER PROTEIN ZK686.4
IN CHROMOSOME III
>gi|630780|pir||S44909 ZK686.4
protein - Caenorhabditis elegans
>gi|304346 (L17337) coded for
by C. elegans cDNAs
GenBank:M88869 and T01933;
putative [Caenorhabditis
elegans]


j 

3e-3< ' 


1737 


AF038404 


Homo sapiens

homolog of Nedd5
(hNedd5) mRNA,
complete cds 


4e-50 


<NONE> 


<NONE> 


<NONE>


1738 


L43618 


Homo sapiens 
polycystic kidney 
disease (PKD1) gene,
exons 35-42 


4e-50 


90375S 


(L43619) polycystic kidney 
disease 1 protein [Homo 
sapiens]


3e-< 


1739 


AF009424 


Homo sapiens clone 
22 mRNA, alternative
splice variant alpha-1,
complete cds 


4e-50 


2271473 


(AF009426) clone 22 [Homo 
sapiens] 


5c 












monosaccharide transport protein




1740 


L77040 


Homo sapiens 
(subclone 8_clt from 
PI H22) DNA 
sequence. 


2e-50


99758


STP4 - Arabidopsis thaliana
>gi|16524 (X66857) sugar
transport protein [Arabidopsis 
thaliana] 


6.4 


1741 


L35657 


Homo sapiens 
(subclone H8 5_al0 
from PI 35 H5 CS) 
DNA sequence. 


2e-50 


2072960 


(U93568) p40 [Homo sapiens] 


2e-05


1742 


U80745 


Homo sapiens CTG7a 
mRNA, partial cds 


1e-50


<NONE>


<NONE> 


<NONE> 


1743 


D84514 


Bovine mRNA for 
p97, partial cds 


1e-50


3978527


(AF10372S) structural
polyprotein [Sindbis virus]


9.9 


1744 


M22960 


Human protective 
protein mRNA, 
complete cds. 


1e-50


131081 


LYSOSOMAL PROTECTIVE 
PROTEIN PRECURSOR 
(CATHEPSIN A) 
(CARBOXYPEPTIDASE C) 
human >gi|190283 (M22960)
protective protein precursor


1e-12


1745 


X86018


H.sapiens mRNA for
MUF1 protein


1e-50


1082610


muf1 protein - human
>gi|762953 (X86018) muf1
[Homo sapiens]


1e-21


1746 


U03495 


Human transcription 
factor LSF-ID 
mRNA, complete cds. 


7e-51 


2136296 


transcription factor LSF - human 
>gi|476099


1e-21


1747 


AB015344 


Homo sapiens 
HRIHFB2157
mRNA, partial cds


5e-51


3970874


(AB015344) HRIHFB2157
[Homo sapiens] 


2e-35 


1748 


M93339 


Human zinc finger 
protein mRNA. 


4e-51 


3024110 


MYC-ASSOCIATED ZINC 
FINGER PROTEIN sapiens] 


2e-06 


1749 


U71363 


Human zinc finger 
protein zfp6 (ZF6)
mRNA, partial cds 


4e-51 


2689441 


(AC003682) F18547_1 [Homo
sapiens]


2e-U 


1750 


X56932 


H. sapiens mRNA for 
23 kD highly basic 
protein 


4e-51


730451


60S RIBOSOMAL PROTEIN
L13A (23 KD HIGHLY BASIC
PROTEIN)

>gi|?45S97|pir||S29539 basic
protein, 23K - human >gi|23691
(X56932) 23 kD highly basic
protein [Homo sapiens]


1e-11


1751 


Z79054 


H. sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA2iEll


2e-51


<NONE> 


<NONE> 


<NONE> 



WO 01/02568 



PCT/US00/18374 











Homo sapiens 










1752 


AF068245 


BAF60b gene, partial 
sequence 


5e-52 


<NONE> 


<NONE> 


<NONE> 


1753 


AJ236932 


Sus scrofa mRNA for 
hypothetical protein 
(5'; clone 4B8) 


5e-52 


400927 


RIBONUCLEOPROTEIN 
RB97D ribonucleoprotein 
[Drosophila melanogaster]


4.7


1754 


AF003693 


Mus musculus
scaffold protein Pbp1
homolog mRNA, 
complete cds 


6e-53 


2197106 


(AF003693) scaffold protein 
Pbp1 homolog [Mus musculus]


2e-54 


1755 


M27319 


Human calmodulin 
mRNA, complete cds. 


5e-53 


115528 


CALMODULIN 
>gi|l02408|pir||JC1309 
calmodulin - Stylonychia lemnae
(SGC5)>gi|161195 


0.002 


1756 


M74555 


Mouse house-keeping 
protein mRNA, 
complete cds. 


5e-53 


284775 


house-keeping protein - mouse 
>2i|193S71 


5e-30 


1757 


X92720 


H. sapiens mRNA for 
phosphoenolpyruvate 
carboxykinase


6e-54 


2135915 


phosphoenolpyruvate 
carboxykinase (GTP) (EC
4.1.1.32) precursor,
mitochondrial - human
carboxykinase (GTP) [Homo
sapiens]


6e-21 


1758 


AF007872 


Homo sapiens torsinB 
(DQ1) mRNA, partial
cds 


2e-54 


2760121 


(AB002405) LAK-4p [Homo 
sapiens] 


0.27 


1759 


U49507 


Mus musculus
B6CBA Lisch7 
mRNA, partial cds. 


2e-54 


1236083 


(U49507) Lisch7 [Mus 
musculus]


3e-27 


1760 


Z73360 


Human DNA
sequence from
cosmid 92M18,
BRCA2 gene region
chromosome 13q12-
13.


1e-55


2370371 


(Y14657) hydrophobin
[Pleurotus ostreatus] 
>gi|29S2620|gnl|PID|e 1283986 
(AJ225061) POH2 hydrophobin 
[Pleurotus ostreatus] 


2.0 


1761 


U83702 


Human cytochrome c 
oxidase subunit VIa
gene, exon 3 and 
complete cds 


8e-56 


2982994 


(AE0006S2) hypothetical 
protein [Aquifex aeolicus] 


7.0 


1762 


Y12781 


Homo sapiens mRNA 
for transducin (beta) 
like 1 protein


7e-56


3021409


(Y12781) transducin (beta) like
1 protein [Homo sapiens]


7e-39 
















1763 


AB020673 


Homo sapiens mRNA 
for KIAA0866 
protein, complete cds 


oe-j / 




(AF001548) Myosin heavy 
chain (MHY11) (5' partial)
[Homo sapiens]




1764 


AJ236932 


Sus scrofa mRNA for 
hypothetical protein 
(5'; clone 4B8) 


3e-57 


400927 


RIBONUCLEOPROTEIN 

RB97D ribonucleoprotein

[Drosophila melanogaster] 


4.7 


1765 


L06900 


Human dystrophin 
gene, intron I 
containing pseudo 
exon. 


1e-58


4185129


(AC005724) unknown protein
[Arabidopsis thaliana] thaliana]


7.0 


1766 


X93334 


H.sapiens 

mitochondrial DNA, 
complete genome 


9e-59 


1492050 


contagiosum virus subtype 11 


0.17 


1767 




Rattus sp. 7acomp 
protein mRNA, 
complete cds 


JV*J7 


J 1 U7U^U 


(AF064856) 7acomp protein 

[Rattus sp.]


2e-31


1768 


AF081484


Homo sapiens alpha- 
tubulin isoform I 
mRNA, complete cds 






(X06956) alpha-tubulin [Homo 
sapiens] 


*tC — — 


1769 


X71427


Homo sapiens mRNA
for FUS-CHOP
protein fusion


1e-60


/40JJ / 


(U23523) histidine-rich 
[Caenorhabditis elegans]


0.45 


1770 


AF013988


Homo sapiens serine
protease mRNA,
complete cds


1e-60


2564316 


(AB006622) No similarities to 
any reported proteins [Homo 
sapiens]


0.26 


1771 


U25691 


Mus musculus 
lymphocyte specific 
helicase mRNA, 
complete cds 


7e-61


2137490 


lymphocyte specific helicase - 
mouse musculus] 


3e-25 


1772 


X93334 


H.sapiens 

mitochondrial DNA,

complete genome 


4e-61 


70656 


ubiquitin / ribosomal protein 
S27a - human extension protein, 
HUBCEP80 [human, Peptide,
156 aa] ubiquitin extention 
protein [Cavia porcellus] 


9e-0S 


1773 


D38255


Homo sapiens mRNA 
for CAB1, complete
cds 


4e-61 


2135214 


gene MLN 64 protein - human


4e-23 


1774 


U25691 


Mus musculus 
lymphocyte specific 
helicase mRNA,
complete cds 


8e-62 


2137490 


lymphocyte specific helicase - 
mouse musculus]


Se-26 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1775 


M21731 


Human lipocortin-V 
mRNA, complete cds.


oe-oz 




Human Annexin V With Proline
Substitution By Thioproline




1776 


AF021936 


Rattus norvegicus

myotonic dystrophy 
kinase-related Cdc42- 
binding kinase 

beta) mRNA, 
complete cds 


2e-62 


2736153 


(AF021936) myotonic 
dystrophy kinase-related Cdc42-
binding kinase MRCK-beta
[Rattus norvegicus] 


3e-27 


1777 


Y12059


H.sapiens HUNK1
mRNA 


1e-62


3184498 


(AC004798) R31546.1 [Homo
sapiens]


3e-09 


1778 


L37368 


Human (clone E5.1) 
RNA-binding protein 
mRNA, complete cds.


oe-oj 




sialydase - Actinomyces viscosus 

>gl| 1-T I 0_J — 


7.8 


1779 


M27877


Figure 2. Nucleotide 
and translated protein 
sequences of HPF1, -
2, and -9.


5e-63 


1731443 


ZINC FINGER PROTEIN 83 
(ZINC FINGER PROTEIN 
HPF1) >gi|106023|pir||A32891 
finger protein 1, placental -
human 


3e-33 


1780 


AF095448 


Homo sapiens 
putative G protein- 
coupled receptor 


2e-63 


3116131 


(AL023288) hypothetical 
protein 


4.6 


1781


L19437


Human transaldolase
mRNA containing 
transposable element, 
complete cds 


2e-63 


1553119 


(U63159) transaldolase [Mus 
musculus]


4e-18 


1782 


L41351 


Homo sapiens 
prostasin mRNA, 
complete cds 


1e-63


2833277 


PROSTASIN PRECURSOR 
precursor - human >gi|862305 
(L41351) prostasin [Homo 
sapiens] >gi|1143194 (U33446)
prostasin [Homo sapiens]


6e-14 


1783 


AF053470 


Homo sapiens 10kD
protein (BC10)
mRNA, complete cds


6e-64 


482237 


hypothetical protein K03H1.9 - 
Caenorhabditis elegans


0.029 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1784 


D37791 


Mouse mRNA for 
beta-1,4-
galactosyltransferase


6e-64 


3880102 


similar to FYVE zinc
finger; cDNA EST yk265b4.5
comes from this gene; cDNA
EST yk359g9.5 comes from this
gene; cDNA EST yk319c2.5
comes from this gene
[Caenorhabditis elegans]


3e-16 


1785


AF015770 


Mus musculus radical 
fringe (radical-fringe)
mRNA, complete cds 


6e-64 


2204355 


(U94350) radical fringe 
precursor [Mus musculus] 


1e-36


1786


Z79054 


H. sapiens flow-sorted 
chromosome 6 

HindIII fragment,
SC6pA21E11


2e-64 


<NONE> 


<NONE> 


<NONE> 


1787 


M83094 


Homo sapiens 
cytosolic selenium- 
dependent glutathione 
peroxidase gene,
complete cds, and
rhoh12 gene, 3' end.


1e-64


2447063 


(U42580) A565R [Paramecium 
bursaria Chlorella virus 1]


8.8 


1788 


Y10211 


H. sapiens LAG-3 
gene, promoter region 


7e-65 


1944540 


(X14112) tegument protein
[human herpesvirus 1] 


2.3 


1789


M19045


Human lysozyme 
mRNA, complete cds. 


2e-65 


<NONE> 


<NONE> 


<NONE> 


1790 


U01882


Homo sapiens SS- 
A/Ro autoantigen 52 
kda component gene, 
complete cds 


2e-65 


5S5401 


LIPASE MODULATOR
PRECURSOR (LIPASE 
HELPER PROTEIN) 
>gi|480045|pir||S36249 lipB
protein - Pseudomonas glumae 
>gi|49207 (X70354) helper 
protein 


4.2 


1791 


AF069517 


Homo sapiens RNA 
binding protein DEF- 
3 mRNA, complete 
cds 


2e-65 


3212101 


(AF069517) RNA binding 
protein DEF-3 [Homo sapiens] 


1e-25
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens jerky 










1792 


AF004715 


gene product 
homolog mRNA,
complete cds 


2e-65 


2314829 


(AF004715) jerky gene product
homolog [Homo sapiens] 


2e-45 


1793 


X59652 


C. longicaudatus hprt 
mRNA for 
hypoxanthine 


3e-66 


631625 


hypoxanthine (guanine) 
phosphoribosyltransferase - long 
tailed hamster 
phosphoribosyltransferase 
[Cricetulus longicaudatus] 


6e-54 


1794 


U94350 


Mus musculus radical 
fringe precursor 
mRNA, complete cds 


3e-67 


2204355 


(U94350) radical fringe 
precursor [Mus musculus] 


2e-33 


1795




Mus musculus 
putative 

lysophosphatidic acid 
acyltransferase 
mRNA, complete cds


3e-68 


2317725


(AF015811) putative
lysophosphatidic acid 
acyl transferase [Mus musculus] 


7e-51


1796 


J03137 


Cow 

phosphoinositide- 
specific 

phospholipase C 


3e-69 


U 9 26908 


phospholipase C 154 [Bos 
taurus] 


3e-25 


1797 


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR- 
AKL) mRNA, 
complete cds 


1e-69


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA
reductase [Rattus norvegicus]


2e-33 


1798 


AF015811 


Mus musculus 
putative 

lysophosphatidic acid 
acyl transferase 
mRNA, complete cds 


4e-70 


2317725 


(AF015811) putative
lysophosphatidic acid
acyl transferase [Mus musculus]


3e-19


1799 


X65157


M.musculus mRNA
for desmoyokin,
partial


5e-74 


109781


desmoyokin - mouse (fragment) 
>gi|50675 


9e-37 


1800 


Z97207


Mus musculus mRNA
for B-IND1 protein


2e-74 


2231019


(Z97207) B-IND1 protein [Mus
musculus]


6e-21 


1801 


U27196


Gallus gallus zinc
finger protein (Fzf-1)
mRNA, complete cds.


6e-75 


984814


(U27196) zinc finger protein
[Gallus gallus]


?e-44 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












70 KD WD- REPEAT TUMOR- 




1802 


Y15054


Rattus norvegicus 
mRNA for 70 kDa 
tumor specific 
antigen, partial 


3e-77 


3123027 


SPECIFIC ANTIGEN 
>gi|2505957|gnl|PID|e353992
(Y15054) 70 kD tumor-specific
antigen [Rattus norvegicus]


4e-42 


1803 


X65157 


M.musculus mRNA
for desmoyokin,
partial 


3e-79 


109781 


desmoyokin - mouse (fragment) 
>gi|50675 


9e-33 


1804 


U50736 


Rattus norvegicus 
cardiac adriamycin
responsive protein
mRNA, complete cds


2e-84 


1362781 


cytokine inducible nuclear 
protein C193 - human 
>gi|793841 (X83703) nuclear
protein [Homo sapiens] 


7e-30 


1805 


AF072865


Rattus norvegicus 
thioredoxin reductase 
(TrxR2) mRNA, 
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


2e-84 


3757888 


(AF072865) thioredoxin 
reductase [Rattus norvegicus]


6e-43 


1806


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR-
AKL) mRNA,
complete cds 


6e-85


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus] 


1e-41


1807 


U19181


Rattus norvegicus 
Rabin3 mRNA, 
complete cds. 


2e-87 


624225 


(U19181) Rabin3 [Rattus
norvegicus] 


2e-41 


1808 


U40342 


Mus musculus ninein 
mRNA, complete cds.


1e-91


1113865 


(U40342) ninein [Mus 
musculus] 


2e-36 


1809 


X67877 


R. norvegicus mRNA 
for cytosolic 
resiniferatoxin- 
binding protein 


4e-92 


136077 


TROPOMYOSIN BETA 3, 
FIBROBLAST chicken 
>gi|515694 (M23082)
tropomyosin [Gallus gallus] 


0.56 


1810


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR-
AKL) mRNA,
complete cds 


5e-93 


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus] 


1e-50


1811 


AF035527 


Mus musculus EHF 
(Ehf) mRNA,
complete cds 


2e-95 


3138930 


(AF035527) EHF [Mus 
musculus] 


2e-47 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1812 


AB016930


Cricetulus griseus 
mRNA for 

Phosphatidylglycerop
hosphate synthase,
complete cds 


6e-96 


4159682 


(AB016930)

Phosphatidylglycerophosphate 
synthase [Cricetulus griseus]


7e-41 


1813 


AB005549 


Rattus norvegicus 
mRNA for atypical 
PKC specific binding 
protein, complete cds 


7e-97 


3868778 


(AB005549) atypical PKC 
specific binding protein [Rattus 
norvegicus] 


3e-41 


1814 


X90849 


G.gallus PB1 gene


2e-97 


2134381


polybromo 1 protein - chicken
>gi|951231 (X90849)
polybromo 1 protein [Gallus
gallus]


1e-34


1815 


S79873 


h-lamp-2=lysosome-
associated membrane
protein-2 protein-2b 
(LAMP2) mRNA, 
alternatively spliced 
form h-lamp-2b, 
complete cds. 


3e-98 


<NONE> 


<NONE> 


<NONE> 


1816 


U67203 


Mus musculus ACF7
neural isoform 1
(mACF7) mRNA, 
partial cds 


2e-98 


1675224 


(U67204) ACF7 neural isoform 
2 [Mus musculus] 


9e-39 


1817 




Rattus norvegicus 
nuclear-encoded 
mitochondrial 
elongation factor G 
mRNA, complete cds. 


e-100


jo_)Uo4 


ELONGATION FACTOR G,
MITOCHONDRIAL 
PRECURSOR (MEF-G) 
>gi|543383|pir||S40780 
translation elongation factor G, 

m!»A,-.UnnJnnl * .-rill 1 A t AO 

mitochondrial - rat >gip 1U1UZ 




1818 


X84692


M.musculus Spnr 
mRNA for RNA 
binding protein 


e-133 


1363238 


spermatid perinuclear RNA- 
binding protein Spnr - mouse 
>gi|673454 (X84692) spermatid 

protein [Mus musculus]


5e-35 


1819 


U50736 


Rattus norvegicus 
cardiac adriamycin 
responsive protein 
mRNA, complete cds


e-113 


1362781 


cytokine inducible nuclear 
protein C193 - human 
>gi|793841 (X83703) nuclear
protein [Homo sapiens]


2e-36 


1820 


S66855


HoxB9=Hox-2.5 
[mice, embryos, 
mRNA Partial, 786
nt]


e-107 


1708355 


HOMEOBOX PROTEIN HOX-
B9 (HOX-2.5)


Se-37 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


1821


S66855


HoxB9=Hox-2.5
[mice, embryos,
mRNA Partial, 786
nt]


e-108 


1708355 


HOMEOBOX PROTEIN HOX-
B9 (HOX-2.5)


4e-37 


1822 


U92072 


Rattus norvegicus m- 
tomosyn mRNA, 
complete cds 


e-102 


3790389 


(U92072) m-tomosyn [Rattus 
norvegicus]


2e-38 


1823 


D17577


Mouse mRNA for 
kinesin-like protein
(Kif1b), complete cds


e-129 


2497524 


KINESIN-LIKE PROTEIN 
KIF1B mouse 

>gi|407339|gnl|PID|d1005029
(D17577) Kif1b [Mus
musculus] 


2e-39 


1824 


AF062484 


Mus musculus SDP8 
mRNA, complete cds 


e-122 


3126981 


(AF062484) SDP8 [Mus
musculus] 


5e-40 


1825 



X73683 


R.norvegicus mRNA 
for histone H3.3 


e-109 


122075


(H3.3Q) histone H3.3 - fruit fly 
(Drosophila melanogaster) 
histone H3.3B - chicken 
>gi|2119023|pir||S6121S histone
H3.3 - fruit fly (Drosophila
hydei) 1-1361 [Oryctolagus
cuniculus] >gi|8046 (X53822)
Histone H3.3Q gene product
[Drosophila melanogaster]
->gip l i.vo ganusj >gijioiiyu
(M17S76.) histone H3 [Spisula
solidissima] >gi|211853
(M11393) histone 3.3 [Gallus
gallus] >gi|30684S (M11354)
H3.3 histone [Homo sapiens]
melanogaster] >gi|96303 I
(XS1205) histone H3.3 H3.3A
variant [Drosophila 




1826 


U67203


Mus musculus ACF7
neural isoform 1
(mACF7) mRNA,
partial cds


e-102 


1675224


(U67204) ACF7 neural isoform
2 [Mus musculus]


2e-40 


1827 


D17577


Mouse mRNA for
kinesin-like protein
(Kif1b), complete cds


e-131 


2497524


KINESIN-LIKE PROTEIN
KIF1B mouse

>gi|407339|gnl|PID|d1005029
(D17577) Kif1b [Mus
musculus]


7e-42 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1828 


AB016930 


Cricetulus griseus 
mRNA for 

Phosphatidylglycerop
hosphate synthase,
complete cds 


e-131


4159682 


(AB016930) 

Phosphatidylglycerophosphate
synthase [Cricetulus griseus] 


3e-43 


1829 


U09874 


Mus musculus SKD3 
mRNA, complete cds. 


e-122 


2493735 


SKD3 PROTEIN SKD3 [Mus 
musculus]


7e-48 


1830 


X99145 


C.familiaris mRNA 
for C3VS protein


e-110 


1429314 


(X99145) overexpressed in 
thyroid tissue after TSH 
stimulation [Canis familiaris] 


2e-49 


1831 


X99836 


P.walti mRNA for 
rnp associated protein 
55 


e-106


4200286 


(X99836) rap55 [Pleurodeles
waltl] 


2e-50 


1832 


AF077003 


Mus musculus SH3 
domain-containing 
adapter protein 
mRNA, complete cds


e-121 


3550240 


(AF077003) SH3 domain- 
containing adapter protein; 
CD2AP 


3e-51 


1833 


AF060246 


Mus musculus strain 
C57BL/6 zinc finger 
protein 106 (Zfp106)
mRNA, H3a-a allele, 
complete cds 


e-118


3372657 


(AF060246) zinc finger protein 
106 [Mus musculus] 


1e-52


1834 


Z14030


R.norvegicus mRNA 
for TRAP-complex 
gamma subunit. 


e-120 


1174453 


TRANSLOCON-
ASSOCIATED PROTEIN,
GAMMA SUBUNIT (TRAP-
GAMMA) (SIGNAL
SEQUENCE RECEPTOR
GAMMA SUBUNIT) (SSR-
GAMMA)
>gi|423185|pir||S33294
translocon-associated protein
gamma chain - rat norvegicus]


7e-54 


1835 


AF077003 


Mus musculus SH3 
domain-containing 
adapter protein 
mRNA, complete cds


e-132


3550240


(AF077003) SH3 domain-
containing adapter protein;
CD2AP 


5e-54 


1836 


L20427


Rattus norvegicus
dihydroxypolyprenylb
enzoate
methyltransferase
mRNA, complete cds


e-116 


457372


(L20427)
dihydroxypolyprenylbenzoate
methyltransferase [Rattus
norvegicus]


4e-56 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 






P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












PROTEIN TSG24 (MEIOTIC 




1837 


X80169 


M.musculus mRNA
for 200 kD protein 


e-122 


1717793 


CHECK POINT 
REGULATOR) 
>gi|1083553|pir||A55117 tsg24


2e-56 


1838 


AF080568 


Rattus norvegicus 

CTP:phosphoethanol
amine
cytidylyl transferase
mRNA, complete cds


e-119 


3396102 


(AF080568) 

CTP:phosphoethanolamine
cytidylyl transferase 


6e-58


1839 


X99145 


C.familiaris mRNA 
for C3VS protein 


e-121 


1429314 


(X99145) overexpressed in 
thyroid tissue after TSH 
stimulation [Canis familiaris]


2e-58


1840 


AF019075


Pan troglodytes breast 
and ovarian cancer 
susceptibility 
(BRCA1) gene, 
partial cds 


e-145 


2218154 


(AF005068) breast and ovarian
cancer susceptibility protein 
splice variant [Homo sapiens] 


leoS 


1841 


U55042 


Bos taurus myosin X, 
complete cds 


e-122 


1755049


(U55042) myosin X [Bos 
taurus] 


1e-61


1842 


AJ007780 


Mus musculus mRNA 
for poly(ADP-ribose)
polymerase-2


e-119 


3283975 


(AF072521) poly-(ADPribosyl)-
transferase homolog PARP


4e-62 


1843 


AF072865 


Rattus norvegicus 
thioredoxin reductase
(TrxR2) mRNA, 
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


e-105 


3757888 


(AF072865) thioredoxin 
reductase [Rattus norvegicus]


3e-62


1844 


U55042 


Bos taurus myosin X, 
complete cds 


e-121 


1755049 


(U55042) myosin X [Bos 
taurus] 


le-62 


1845 


X61506


Mouse E46 mRNA 

IKJl piULClIl 


e- 1 _>v 




BRAIN PROTEIN E46


9e-67 


1846 


D90335


Bovine mRNA for
GTP-binding protein
alpha-subunit


e-14S 


585174


GUANINE NUCLEOTIDE-
BINDING PROTEIN, ALPHA-
14 SUBUNIT (GL1)
>gi|108711|pir||A40891 GTP-
binding protein GL1 alpha chain
bovine protein, alpha-subunit
[Bos taurus]


2e-69 


1847 


U49507


Mus musculus
B6CBA Lisch7
mRNA, partial cds.


e-140 


2121326


(AC002128) Lisch7 [Homo
sapiens]


2e-74 
Table 4





Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


1


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


2 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


3 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


4


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


5 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


6 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


7 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


8 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


9 


<NONE>


<NONE> 


<NONE> 


<NONE> 


<NONE>


<NONE> 


10 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


11


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


12


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


13


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


14 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


15 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


16 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


17 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


18


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


19 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


20 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


21 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


22 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


23 


<NONE>


<NONE> 


<NONE> 


1079469 


lMDC I protein - crab-eating 
macaque 


9.3 


24 


<NONE> 


<NONE> 


<NONE> 


3043656 


(AB011138) KIAA0506 protein
[Homo sapiens]


9.3 


25 


<NONE> 


<NONE> 


<NONE> 


112175 


potassium channel protein RK5 - 
rat protein [Rattus norvegicus]


8.6 


26 


<NONE> 


<NONE> 


<NONE> 


3769624 


(AF091565) olfactory receptor 
[Rattus norvegicus] 


7.2 


27 


<NONE> 


<NONE> 


<NONE> 


3876443 


(ZSI517) F2SB1.6 
[Caenorhabditis elegans]


7.1 


28 


<NONE> 


<NONE>


<NONE> 


2224464 


(AB001684) ORF249 [Chlorella
vulgaris] 


6.9 


29 


<NONE> 


<NONE> 


<NONE> 


1519707 


(U67940) ORFveglOfc; random 
cDNA sequence [Dictyostelium
discoideum] 


6.7 


30 


<NONE> 


<NONE> 


<NONE> 


227491 


protein kinase C II [Xenopus 
laevis] 


6.7 


31 


<NONE> 


<NONE> 


<NONE> 


630575 


C50C3.4 protein - 
Caenorhabditis elegans


6.0 


32 


<NONE> 


<NONE> 


<NONE> 


137290 


35 KD PROTEIN IN RNA2 
clover necrotic mosaic virus 
>gi|61466 (X08021) ORF for 35
kDa polypeptide (AA 1-317)
[Red clover necrotic mosaic
virus]


6.0 
Nearest Neighbor (BlastN vs. Genbank)




Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



DESCRIPTION 



(X167U) pid:g3004I [Homo 



P VALUE 



CELL DIVISION PROTEIN 
FTSW 



(D63999) hypothetical protein 



NITROGEN REGULATORY 
PROTEIN AREA 

MITOCHONDRIAL 

RIBOSOMAL PROTEIN S5 
Emericella nidulans 
mitochondrion (SGC3) 
>gi| 12709 nidulans] >gi|472822 
(JO 1390) unknown 



5.7 



5.7 



5.2 



4.3 



— j protein 

{AL0J4TO) predicted using 

Genefinder; similar to WD 
domain, G-beta repeat; cDNA 
EST yk362f7.5 comes from this 
gene; cDNA EST yk362f7.3 
comes from this gene 
[Caenorhabditis elegans]



(U31329) polyketide synthase
[Aspergillus terreus] 



(AL031530) hypothetical zinc 
finger protein 

[Schizosaccharomyces pombe] 



AXONEME-ASSOCIATED 
PROTEIN MST101(1) product
[Drosophila hydei]

HYPOTHETICAL 21.7 KD

PROTEIN IN INTE-PIN 
INTERGENIC REGION 
>gi| 1787402 (AE000214) orf, 
hypothetical protein 
[Escherichia coli]



(AF071556) anthranilate 
dioxygenase large subunit 



(U43139) envelope glycoprotein 
gpl20 [Human 
immunodeficienc 



4.0 



3.3 



3.0 



2.6 



2.5 



2.4 



(Z75536) similar to dynein 
heavy chain; cDNA EST 
EMBL:D27549 comes from this 
gene; cDNA EST 
EMBL:D34859 comes from this 
gene [Caenorhabditis elegans]



1.4 



WO 01/02568 



PCT/US00/18374 



Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION



46 | <NONE>



47 | <NONE> 



48 



<NONE> 



DESCRIPTION 



<NONE> 



<NONE> 



<NONE> 



49 



<NONE> 



50 



<NONE> 



51 



<NONE> 



52 | <NONE>



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



<NONE> 



3881150 



(AL032647) predicted using



Genefinder 
COLANIC ACID CAPSULAR



<NONE> 



132200 



BIOSYNTHESIS 
ACTIVATION PROTEIN A 
>gi|95605|pir||S17701 rcsA
protein



<NONE> 



2204286 



(U61380) germination protein 
[Bacillus megaterium]



<NONE> 



<NONE> 



<NONE> 



<NONE> 



53 | <NONE>



<NONE> 



<NONE> 



<NONE> 



1723955 



<NONE> 



3201564 



<NONE> 



2808721 



<NONE> 



602434 



<NONE> 



3347955 



<NONE> 



<NONE> 



1255SS7 



HYPOTHETICAL 11.4 KD
PROTEIN IN FOX1-KEX1 
INTERGENIC REGION 
>gi|2132566|pir||S64222 
probable membrane protein 
YGL204c - yeast 
(Saccharomyces cerevisiae) 
>gi|1322838|gnl|PID|e243803 
(Z72726) ORF YGL204c 
[Saccharomyces cerevisiae] 



(AJ006514) prolipoprotein 
diacylglyceryl transferase 
[Vibrio cholerae]



P VALUE



(A102142S) hypothetical 
protein Rv0064 



(U17986) GABA/noradrenaline 
transporter [Homo sapiens] 



(AF076184) cytosolic sorting 
protein PACS-lb [Rattus 
norvegicus] 



U0JJ44J coded for by C.
elegans cDNA yk92b4.5: coded
for by C. elegans cDNA
yk73al.5; coded for by C.
elegans cDNA ykl02e9.5;
coded for by C. elegans cDNA 
yk71c8.5; coded for by C. 
elegans cDNA yk66d 11.5; 
coded for by C. elegans cDNA 
yk66c 3... 



1.4 



1.1 



1.0 



0.84 



0.31 



0.27 



o.i: 



0.12 



0.074 



55 | <NONE>



<NONE> 



<NONE> 



103076 



Bkm-like sex-determining
region hypothetical protein
CS314 - fruit fly (Drosophila
melanogaster)



56 | <NONE>



<NONE> 



<NONE> 



107560 



Ras inhibitor (clone JC265) 
human sapiens] 



0.003 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












Bkm-like sex-determining




57 


<NONE> 


<NONE> 


<NONE> 


103076 


region hypothetical protein 
CS314 - fruit fly (Drosophila
melanogaster) 


2e-04 


58 


<NONE> 


<NONE> 


<NONE>


2702370 


(AF038604) contains similarity 
to Drosophila ovarian tumor 
locus protein (GB:X13693) 
[Caenorhabditis elegans]


6e-05 


59 


<NONE> 


<NONE> 


<NONE> 


3859713 


(AL033501) phox domain 
protein [Candida albicans] 


3e-05 


60 


<NONE> 


<NONE> 


<NONE> 


2088839 


(AF003386) F59E12.5 gene 
product [Caenorhabditis 
elegans] 


2e-08 


61 


<NONE> 


<NONE> 


<NONE> 


121059 


GC-RICH SEQUENCE DNA- 
BINDING FACTOR GCF - 
human >gi|179412 (M29204) 
DNA-binding factor [Homo 
sapiens] 


4e-09 


62 


<NONE> 


<NONE> 


<NONE> 


3875246 


U.514VU) similar to WD
domain, G-beta repeats (2 
domains); cDNA EST 
EMBL:T00482 comes from this 
gene; cDNA EST 
EMBL:T00923 comes from this 
gene; cDNA EST yk449d4.3 
comes from this gene; cDNA 
EST yk449d4.5 comes from this 
gen... 


9e-24 




<NONE>


<NONE> 


<NONE> 


1465S34 


(U64857) No definition line 
found [Caenorhabditis elegans] 


9e-2S 


64 


<NONE> 


<NONE> 


<NONE> 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-29


65 


<NONE> 


<NONE> 


<NONE> 


3880433 


(Z66521) similar to 
mitochondrial RNA splicing
MSR4 like protein; cDNA EST 
EMBL:C09217 comes from this 
gene [Caenorhabditis elegans] 


8e-31 


66 


D42133 


Rat annexin V gene,
exon7 and exon8 


5.0 


<NONE> 


<NONE> 


<NONE> 


67 


L35679


Homo sapiens
(subclone H8 2_d11
from P1 35 H5 C8)
DNA sequence. 


5.0 


1086902


(U412/S) coded for by C.
elegans cDNA yk79g8.5; coded
for by C. elegans cDNA
cm10c8; coded for by C. elegans
cDNA yk79g8.3; similar to
leucine-rich repeats found in
many proteins [Caenorhabditis
elegans]


6.6 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






HIV-1 strain BX220










68 


U90184 


from USA, envelope 
glycoprotein C2V3 
region (env) gene, 
partial cds 


5.0 


1297070 


(Z71986) convicilin precursor
[Vicia narbonensis] 


6.6 


69 


U61465 


Human myosin VIIa
(MYO7A) gene, 5'
exon 37 


5.0 


2313225 


(AE000535) L-lactate permease 
(lctP) [Helicobacter pylori
26695] 


I J 


70 


AF013717 


Homo sapiens 
periplakin (PPL) 
mRNA, partial cds 


5.0 


3719238


(AF064869) brain-enriched 
guanylate kinase-associated 
protein 2; BEGA2 [Rattus 
norvegicus] 


3.8 


71 


X58245 


Soybean mRNA for 
HMG-1 like protein 


5.0 


2995363 


(AL022245) biotin synthase


0.99 


72 


AF102425


Frasera paniculata 
tRNA-Leu (trnL) 
gene, intron, 
chloroplast sequence 


4.9 


3522958 


(AC004411) putative 
pectinesterase [Arabidopsis 

thaliana]


0.4 


73 


X82817 


H. sapiens 

PTP1C/HCP variant
gene 


4.9 


3875514 


^014^4; cDNA EM 
EMBL:D27474 comes from this 
gene; cDNA EST 
EMBL:D27473 comes from this 
gene; cDNA EST 
EMBL:T00471 comes from this 
gene; cDNA EST 
EMBL:D34192 comes from this 
gene; cDNA EST 
EMBL:D37241 comes from this 
gene; ... 


2.8 


74 


U04827


Mus musculus brain
fatty acid-binding
protein


4.9 


3676132


(AL031765) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=31.96; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SPTREMBL:
393319; 2-
match_description=HYPOTHE
TICAL PROTEIN C33A11.2.;...


2e-09 


75 


AF038859


Neospora hughesi
strain NE1 internal
transcribed spacer 1,
complete sequence


4.8 


<NONE> 


<NONE> 


<NONE> 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION DESCRIPTION 



76 



77 



78 



M.musculus MFH-1



Y08222 



gene 



AJ224475 



Borrelia burgdorferi 
left chromosomal 
subtelomeric region 
(pfpB gene) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION



4.8 



<NONE> 



U02486 



Mus musculus LAP
putative membrane
protein (KRAG)
gene, exon 3 and
complete cds



4.8 



DESCRIPTION 



<NONE> 



(AJ236702) HMR1 protein 
4218141 [Antirrhinum majus] 



4.8 



3258103 



(AP000006) 367aa long 
hypothetical protein 
[Pyrococcus horikoshii] 



P VALUE 



<NONE> 



8.3 



2.7 



79 



AB000280



Rat mRNA for 
peptide/histidine 
transporter, complete 
cds 



4.8 



806317 



(M29067) unknown protein 
[Saccharomyces cerevisiae] 



0.001 



80 



Z49771 



A.cepa mitochondrial
gene for NADH
dehydrogenase
subunit 3 and
ribosomal protein
S12 



4.5 



<NONE> 



<NONE> 



<NONE> 



81 



M63494 



Mouse IgG receptor
(beta-Fc-gamma-RII)
gene, exons 6 and 7,
clones lambda-
Fc(3.2,93).



4.3 



<NONE> 



<NONE> 



<NONE> 



82 



Z14035 S.pombe carl gene 



2.0 



3790665 



(AF099000) No definition line 
found [Caenorhabditis elegans]



1.2 



83 



U17129 



Rhodococcus
erythropolis ThcA
(thcA) gene, complete
cds; and unknown
genes



84 



AE001386 



Plasmodium
falciparum 
chromosome 2, 
section 23 of 73 of 
the complete 
sequence 



2.0 



2828280 



(AL021687) putative protein 
[Arabidopsis thaliana] 
>gi|2832633|gnl|PID|el249651 
(AL021711) putative protein
[Arabidopsis thaliana]



2e-26 



2.0 



4176500 



85 



Human clone 23734 
U79292 mRNA sequence 



1.9 



86 



V00159 



Chloroplast Euglena
gracilis gene coding 
for the 5S and 16S 
rRNA. 



<NONE> 



1.9 



<NONE> 



(AL031177) dJ889M15.3 (novel 
protein) 



9e-59 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



87 



88 



89 



90 



91 



92 



93 



94 



95 



96 



DESCRIPTION 



97 



U95094 



X93206 



U60979 



X56272 



L22383 



U82814 



U18504



Xenopus laevis XL- 



INCENP (XL- 
INCENP) mRNA 

complete cds 

H.salinarium TATA 
box-binding protein 
genes and ORFs

Caenorhabditis 
elegans programmed 
cell death specifier 
(ces-2) gene, 
complete cds 



C tentans ORFs (A- 
E) for hemoglobin 



Homo sapiens DNA 
sequence, repeat 
region.



Hirudo medicinalis 
neuron-specific 
protein mRNA,
complete cds 



X53676 



Haplomitrium 
hookeri 18S rRNA 
gene, partial 

sequence. 

Pseudomonas stutzeri 



Nearest Neighbor (BlastX vs. Non-Redundant Protei 



P VALUE | ACCESSION 



DESCRIPTION 



1.9 



1.9 



<NONE> 



<NONE> 



1.9 



<NONE> 



1.9 



<NONE> 



ns) 



P VALUE



<NONE> 



<NONE> 



<NONE> 



1.9 



<NONE> 



1.9 



3822533 



1.9 



1083969 



nosDFY genes 
involved in copper 
processing 



U60086 



U33447 



M81327 



Dictyostelium 
discoideum multidrug 
resistance 
transporter/Ser 
protease (tagC) 
mRNA, complete cds. 
Human putative G- 



protein-coupled 
receptor (GPR17) 
gene, complete cds 



Sus scrofa lactoferrin
mRNA, complete cds.

:: gb|I28421|I28421
Sequence 5 from 
patent US 5571691 



1.9 



2980781 



<NONE> 



<NONE> 



(AF094531) immunoglobulin 
heavy chain precursor 



hypothetical protein 6 - fowlpox 
virus virus] 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



2.0 



(AL022198) putative protein



1.9 



3879530 



1.9 



3880034 



1.8 



<NONE> 



(Z49130) cDNA EST
yk486b9.3 comes from this 
gene; cDNA EST yk486b9.5 
comes from this gene 



(Z75550) similar to cell division 
control protein [Caenorhabditis 
elegans] 



<NONE> 



2.0 



0.70 



6e-05 



7e-14 



<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


98


Y07622


S.iniae lctP & lctO
genes and ORF1


1.8


<NONE>


<NONE>


<NONE>


99


M60474


Mouse myristoylated
alanine-rich C-kinase
substrate (MARCKS)
mRNA, complete cds


1.8


<NONE>


<NONE>


<NONE>




100


Y13901


Homo sapiens FGFR-
4 gene


1.8 


<NONE>


<NONE> 


<NONE>


101


U44400


Human Down 
Syndrome region of 
chromosome 21, 
clone A31D6-1D6. 


1.8


<NONE> 


<NONE> 


<NONE>


102 


U92808 


Ruminococcus albus 
beta-glucosidase 
(gluA) mRNA, 
complete cds 


1.8 


<NONE>


1 <NONE> 


<NONE> 


103 


L25051 


Candida albicans 
argininosuccinate 
lyase (ARG4) gene, 
complete cds. 


1.8 | 


<NONE> 


<NONE> 


<NONE> j 


104 


AE000546 


Helicobacter pylori 
26695 section 24 of 
134 of the complete 
genome 


1.8 \ 


<NONE> 


<NONE> 


<NONE> J 


105 J 


J00978 


Xenopus laevis major 
beta-globin gene, 
complete cds. 


1.8 


<NONE> 


<NONE> 


<NONE> 1 


106 1 


U41716 ( 


Human 

Immunodeficiency 
virus type 1 isolate 
FW95-5, vpr gene, 
complete cds.


1.8 | 


<NONE> 


<NONE> 


<NONE> 


107 J 


( 

X66286 I 


G.gallus mRNA for
tensin


l.O | 


<NONE> j 


<NONE> 


<NONE> 


108 1 


I 

U76636 c 


Xenopus calbindin
D28k mRNA,
complete cds


1.8 f 


<NONE> 


<NONE> 


<NONE> 


109 


r 

J00664 A 


abbit embryonic beta- 
Uglobin sene. 


1.8


<NONE> 


<NONE> 


<NONE> 


110 | 


I 

( 

M21535 n 


Human erg protein
(ets-related gene)
mRNA, complete cds.


1.8 1 


( 

2983160 p 


(AE000693) hypothetical
protein [Aquifex aeolicus]


7.7 1 






SEQ
ID


Nearest 

) 

ACCESSIOI 


Neighbor (BlastN vs. 
^1 DESCRIPTION 


Genbank) 
P VALUE 


Nearest Neigh 

ACCESSION 


bor (BlastX vs. Non-Redundant F 
DESCRIPTION 


Voteins) 
P VALUE 


111 


I M80829 


Rat troponin T 
cardiac isoform gene 
complete cds 


1.8 


999450 


(Z46595) incomplete interleukin-
11 receptor isoform [Homo
sapiens] 


7.3 


112 


D37887 


Cyprinus carpio c-
myc gene for c-Myc,
complete cds 


1.8 


3023408 


BRANCHED-CHAIN AMINO 
ACID TRANSPORT SYSTEM 
CARRIER PROTEIN 
(BRANCHED-CHAIN AMINO 
ACID UPTAKE CARRIER) 
>gi|1075007|pir||D64056 
membrane- associated 
component, branched amino 
acid transport system (brnQ) 
homolog - Haemophilus 
influenzae (strain Rd KW20) 
system II carrier protein (brnQ) 
[Haemophilus influenzae Rd] 


I 

7.2 I 


1 13 I 


A PH 1 07A\ 

r\ru l y /oj 


Homo sapiens G
protein-coupled
receptor kinase 1 and
G protein-coupled
receptor kinase 1b
(GRK1) gene,
alternatively spliced, 
alternative exon 6, 
exon 7, and partial 
cds 


1.8 


498643 


(U10270) G-box binding factor 
1 [Zea mays] 


7.2 1 


114 1 


AF025967 


Helicobacter pylori 
J166 virulence 
regulon 
transcriptional 
activator homolog 
gene, partial cds, 
strain-specific
genomic sequence B2


1.8 


( 
t 

3850IOR r 


(AL033388) putative calcium-
transporting atpase
[Schizosaccharomyces pombe]


5.7 1 


115 1 


> 
( 

U13183


Xenopus laevis
(Xwnt-4) mRNA,
complete cds.


1.8 


2494853


PROBABLE
HYDROXYACYLGLUTATHIONE HYDROLASE
(GLYOXALASE II) (GLX II)
protein [Escherichia coli]
>gi|1786406 (AE000130)
probable
hydroxyacylglutathione
hydrolase [Escherichia coli]


5.5 | 






SEQ 
ID 


! Nearest 
ACCESSION 


Neighbor (BlastN vs. ( 
[ DESCRIPTION 


jenbank) 
P VALUE 


Nearest Neieh 
ACCESSION 


oor fBlastX vs. Non-Redundant Pi 
DESCRIPTION 


•oteins) 
P VALUE 


116 


S68944 


Na+/Cl(-)-dependent 

neurotransmitter 

transporter 


1.8 


2276316 


(Z96810) GLYT-1 LIKE [Homo
sapiens] 


) 

5.5 


117 


M92905 


Rat calcium channel 
alpha- 1 subunit(rbB- 
I) mRNA, complete 
cds. 


1.8 


3165522 


(AF067607) Similar to cuticular 
collagen; C18H7.3 


5.5 


118 


X 12429 


Xenopus laevis U1
70K gene exon 10 


1.8 


2735957 


(AF015685) reverse
transcriptase domain protein 


3.3 


119 


DS3333 


Mouse hepatitis virus 
genomic RNA for 
spike protein, partial 
cds 


1.8 


3876559 


yi^&iKjjz) jinuioiiiv lu numaii — 

cyclin A/CDK2-associatd 
protein P19 (RNA polymerase 
elongation factor) 
(SW:SKP1_HUMAN); cDNA
EST EMBL:T00114 comes
from this gene; cDNA EST
yk390f11.5 comes from this
gene; cDNA EST yk402e11.5
co...
>gi|3877216|gnl|PID|e1346850
protein P19 (RNA polymerase
elongation factor) gene; cDNA
EST yk390f11.5 comes from
this gene; cDNA EST
yk402e11.5 co...


3.3 


120 


AF016972


Cervus elaphus 
REDDEER 
mitochondrial D- 
loop, complete 
sequence 


1.8 


3878057 


(Z99942) similar to von 
Willebrand factor type A 
domain; cDNA EST yk412d4.5 
comes from this gene; cDNA 
EST yk412d4.3 comes from this
gene 


3.2 


121 


AB010741


Oncorhynchus mykiss
mRNA for rtSox24,
complete cds 


1.8 


( 

1730805 


PROTEIN IN RPS3-PSD1 
INTERGENIC REGION 
>gi|2132762|pir||S63129 
probable membrane protein 
YNL174w - yeast
(Saccharomyces cerevisiae)
>gi|1302152|gnl|PID|e23954S
(Z71451) ORF YNL174w
[Saccharomyces cerevisiae]


2.5 


122 


J 

i 

U32844 c 


Haemophilus
influenzae Rd section
159 of 163 of the
complete genome


1.8 


i 
I 

I 

72S910 


A-TYPE INCLUSION
PROTEIN (ATI) camelpox
virus >gi|623Sl (X69774)
54kDa A-type inclusion protein
[unidentified]


1.9 
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SEQ 
ID 


Neares 
ACCESSIOI 


t Neiehbor (BlastN vs. 
V DESCRIPTION 


Genbank) 
| P VALUE 


| Nearest Neisr 
1 ACCESSION 


bor (BlastX vs. Non-Redundant F 
DESCRIPTION 


Voteins) i 
P VALUE 


123 1 U18321 


Human ionizing 
radiation resistance 
conferring protein 
mRNA, complete cds.


1.8


1 2133273 


ribosomal protein YS7 homolog 
Emericella nidulans 


r 1 

1.4 5 


124 1 M28668 


Human cystic fibrosis
mRNA, encoding a
presumed
transmembrane
conductance regulator
(CFTR). > ::
gb|I11500|I11500
Sequence 1 from
Patent US 5407796


1.8 


1 90492 


filaggrin precursor - mouse 
(fragment) 


0.87 1 


125 1 AF064553 


Mus musculus NSD1 
protein mRNA, 
complete cds 


1.8 i 


2501207 


PROBABLE PROTEIN 
DISULFIDE ISOMERASE P5 
PRECURSOR >gi|1065461
(U40411) Similar to protein
disulfide-isomerase.
[Caenorhabditis elegans]


0.87 I 


126 I AB002314 


Human mRNA for 
KIAA0316 gene,
complete cds 


1.8 1 


115131 


REGULATORY PROTEIN
BRLA (BRISTLE A PROTEIN)
>gi|83718|pir||A28913
regulatory protein brlA -
Emericella nidulans >gi|168029
(M20631) brlA protein


0.84 


1 

1 ] 


Homo sapiens 
(subclone 10__d2 from 
PI H21) DNA 
sequence. 


1.8 1 


2135624 


metalloproteinase 1 (EC 3.4.24.- 
- human 


0.65 I 


1 ~ ] 

128 M3727S 


R.norvegicus renin 
gene, exons 1-9.


1.8 


< 

40500S7 s 


(AF109907) S164 [Homo
sapiens]


0.58


1 I 

129 1 ' X82879 c 


Artificial sequences
DNA for ART 2
consensus


1.8 | 


( 
1 

310929 t 


(L13442) cysteine-rich extensin-
like protein-4 [Nicotiana
tabacum]


0.52 1 


1 f 

130 1 D89729 c 


Homo sapiens mRNA
for CRM1 protein,
complete cds


1.8 \ 


( 

3559944 r 


(AJ010792) Muc5AC protein
[Mus musculus]


0.38


1 h 

1 s< 

131 1 U7S076 q 


Mus musculus
sepiapterin reductase
gene, exons 1 and 2


1.8 ; 


0 
p 

2984225 a < 


(AE000766) enolase-
phosphatase E-1 [Aquifex
aeolicus]


0.095 1 
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PCT/US00/18374 



SEC 
ID 


Neares 

ACCESSIOI 


t_ Neighbor (BlastN vs. Gcnbank) 
V DESCRIPTION | p VALUE 


1 Nearest Neigh 
ACCESSION 


lbor (BlastX vs. Non-Redundant Proteins) 

DESCRIPTION P vat tfp 


132 


1 V<N^ I IT. 


Paramecium I68G 
gene for 168G 
surface protein 


! 1.8 1 


115316 


COLLAGEN ALPHA 1(VIII)

CHAIN PRECURSOR 

(ENDOTHELIAL 
COLLAGEN) 

>gi|105686|pir||S15435 collagen
alpha 1(VIII) chain precursor -


i 

0.073 


133 


I M77830 


Human desmoplakin 
mRNA, complete cds 


[ I 

1.8 1 


1397246 


(1161^44) coded for by C.
elegans cDNA yk112f3.5; coded
for by C. elegans cDNA
cm21d2; coded for by C.
elegans cDNA CEESR07F;
coded for by C. elegans cDNA
yk112f3.3; coded for by C.
elegans cDNA CEESR29F
[Caenorhabditis elegans]


i 

1e-04


134 


AJ224150 


Plasmodium berghei 
EF-1 alpha A-gene 


1.8 | 


1353761 


(U43192) myosin II heavy chain
[Naegleria fowleri] 


2e-05 


135 


AJ005518


Mus musculus 
somatostatin receptor 
2 gene, exonl and 5" 
flanking region 


1 1.8 J 


1326350 


(U58748) similar to potential 
transmembrane domains in S. 
cerevisiae nuclear division
RFT1 protein (SP:P38206)


o„ no 1 


136 1 


fWWzZ i / 


Ralstonia eutropha 
megaplasmid pHG1
nitric oxide reductase 
(norB) gene, 
complete cds 


1.8 I 


3393018 


(AL031174) hypothetical
protein 


2e-08


137 1 


AF039035 


Caenorhabditis ! 
elegans cosmid 
C53A3 I 


1.8


3850109


(AL033388) 3-oxoacyl-[acyl-
carrier-protein]-synthase


3e-11


138 I 


i 
r 

c 

M81769 r 


S.domesticus 
mi iiLiiiu muDUiin 1 
e arranged gamma 
•hain mRNA, VJC j 
egion. complete cds. 


1.8


( 
F 

3080527


(AL022600) putative mannose-1-
phosphate guanyl transferase
[Schizosaccharomyces pombe]


3e-14 1 


139 1 


Y11106 F 


P.pastoris PYC1 gene


1.8 


X 

r 
c 

1175412 ( 


HYPOTHETICAL 24.2 KD
PROTEIN C13A11.03 IN
CHROMOSOME I >gi|984224
(Z54096) unknown


1e-15


140 1 


C 
d 
k 
fl 

U87803 p 


Human putative
Ca2+/calmodulin-
dependent protein
kinase kinase gene, 3'
flanking region,
partial sequence


1.8 1 


C 
> 

2S2S280 . \i 


(AL021687) putative protein
[Arabidopsis thaliana]
>gi|2832633|gnl|PID|e1249651
(AL021711) putative protein
[Arabidopsis thaliana]


3e-17 1 



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SEC 
ID 


| Neares 
ACCESSIOI 


t Neighbor (BlastN vs. 

V DESCRIPTION 
Plasmodium 


Genbank) 
P VALUE 


S Nearest Neiah 
1 ACCESSION 


bor (BlastX vs. Non-Redundant F 
DESCRIPTION 


Voteins) [ 
P VALUE 


141 


1 AE001430 


falciparum 
chromosome 2, 
section 67 of 73 of 
the complete 
_ sequence 


1.8 


1 1931647 


(U95973) endomembrane 
protein EMP70 precursor isolog


2e-20 


142 


L19708


Rat N-methyl-D-
aspartate receptor
(NMDAR1) gene,
first exon.


1.8 


1 1731181 


" kYh'Ui i-Lbl iCJAL vO.:> KJJ 
PROTEIN C14A4.3 IN 
CHROMOSOME II 
>gi|3874230|gnl|PID|e1351618
protein (Swiss Prot accession
number P38376); cDNA EST
yk220e10.5 comes from this
gene [Caenorhabditis elegans]


3e-21 


143 


Y10728


P.schwarzi 
mitochondrial cytb 
gene, partial 


1.8 


( 3878644 


(Z81103) predicted using
Genefinder; cDNA EST
yk303g11.5 comes from this
gene; cDNA EST yk303g11.3
comes from this gene
[Caenorhabditis elegans]


le-28 


144 


AB006631 


Homo sapiens mRNA 
for KIAA0293 gene, 
partial cds 


1.8 


4176500 


(AL031177) dJ889M15.3 (novel 
protein) 


7e-45 


145 


AF106967 


Mus musculus 13 
protein mRNA, 
complete cds 


1.7


<NONE> 


<NONE>


<NONE> 


146 1 


AE001073 


Archaeoglobus 
fulgidus section 34 of 
172 of the complete 
genome 


1.7 j 


<NONE> 


<NONE> 


<NONE> 


147 1 


U12977


Pseudomonas
lemoignei poly(3-
hydroxybutyrate)
depolymerase A
precursor (phaZ5)
gene, complete cds,
and glycerol-3-
phosphate-
dehydrogenase
homolog, complete
cds.


1.7 1 


<NONE> 


<NONE> 


<NONE> 


148 | 


M27038


Mus musculus
(SK/CamRk)
germline IgK chain
gene, J1-5 region.


1.7 j 


<NONE> 


<NONE> 


<NONE>
Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
149 | X74142 | H.sapiens HBF-1 mRNA for transcription factor | 1.7 | <NONE> | <NONE> | <NONE>
150 | U40830 | Streptococcus thermophilus DeoD gene, partial cds and EpsA, EpsB, EpsC, EpsD, EpsE, EpsF, EpsG, EpsH, EpsI, EpsJ, EpsK, EpsL, EpsM, Orf14.9 protein genes, complete cds | 1.7 | <NONE> | <NONE> | <NONE>
151 | L29172 | Rabbit Ig germline gamma H-chain (allotype d12,e15) C-region gene, 3' end. | 1.7 | <NONE> | <NONE> | <NONE>
152 | M19045 | Human lysozyme mRNA, complete cds. | 1.7 | <NONE> | <NONE> | <NONE>
153 | AE001159 | Borrelia burgdorferi (section 45 of 70) of the complete genome | 1.7 | <NONE> | <NONE> | <NONE>
154 | L17027 | Plasmid pFdA (from Fremyella diplosiphon) DNA sequence, including unidentified cds and stem loop. | 1.7 | <NONE> | <NONE> | <NONE>
155 | U12232 | Arabidopsis thaliana Columbia GTP binding protein beta subunit (AGB1) mRNA, complete cds. | 1.7 | <NONE> | <NONE> | <NONE>
156 | D42056 | Arabidopsis thaliana ATPK6 mRNA for ribosomal-protein S6 kinase homolog, complete cds | 1.7 | <NONE> | <NONE> | <NONE>
157 | X98117 | Rhizobium leguminosarum prsD, prsE, ORF3 genes | 1.7 | <NONE> | <NONE> | <NONE>
Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
158 | AF039084 | Spinacia oleracea heat shock 70 protein protein, complete cds | 1.7 | <NONE> | <NONE> | <NONE>
159 | Z12651 | R.norvegicus gene for catechol methyltransferase | 1.7 | <NONE> | <NONE> | <NONE>
160 | AF002968 | Fringilla coelebs mitochondrial control region, partial sequence | 1.7 | <NONE> | <NONE> | <NONE>
161 | AE001160 | Borrelia burgdorferi (section 46 of 70) of the complete genome | 1.7 | <NONE> | <NONE> | <NONE>
162 | U67553 | Methanococcus jannaschii section 95 of 150 of the complete genome | 1.7 | <NONE> | <NONE> | <NONE>
163 | M86247 | S.ruminantium plasmid pS23 DNA. | 1.7 | <NONE> | <NONE> | <NONE>
164 | S74436 | oIL-8=interleukin-8 [sheep, spleen cells, mRNA, 1435 nt] | 1.7 | <NONE> | <NONE> | <NONE>
165 | D12719 | Candida maltosa ALK7 (CYP52A10) and ALK8 complete cds | 1.7 | <NONE> | <NONE> | <NONE>
166 | U02625 | Geotrichum candidum NRRL Y-553 lipase gene, partial cds. | 1.7 | 321245 | 230k bullous pemphigoid antigen BPM1 - mouse | 9.3
167 | Z58881 | H.sapiens CpG DNA, clone 114a4, reverse read cpg114a4.rt1a. | 1.7 | 1854675 | (U66298) bone morphogenetic protein-6 [Rattus norvegicus] | 9.1
168 | U43674 | Agrobacterium tumefaciens conjugal transfer region 1 genes | 1.7 | 1352066 | LARGE PROLINE-RICH PROTEIN BAT2 MHC class III histocompatibility antigen HLA-B-associated transcript 2 - human >gi|179339 (M33509) HLA-B-associated transcript 2 (BAT2) [Homo sapiens] >gi|179345 (M33518) HLA-B-associated transcript 2 (BAT2) [Homo sapiens] | 9.1
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SEQ 
ID 


Nearest 
ACCESSION 


Neighbor (BlastN vs. ( 
I DESCRIPTION 


jenbank) 
P VALUE 


| Nearest Neish 
1 ACCESSION 


3or (BlastX vs. Non-Redundant P 
DESCRIPTION 


roteins) 
P VALUE 


169 
170 


AL023827 


Caenorhabditis
elegans cosmid 
Y12A6A, complete 
sequence 
[Caenorhabditis 
elegans] 


1.7 


1 731440 


PROTOPORPHYRINOGEN
OXIDASE (PPO) yeast

(Saccharomyces cerevisiae) 
>gi|603606 (U18778) Heml4p; 
protoporphyrinogen oxidase 
[Saccharomyces cerevisiae] 
>gi|1403536|gnl|PID|e249333 
(Z71381) protoporphyrinogen 
oxidase [Saccharomyces 
cerevisiae] 


8.9 


X69662 


X.laevis mRNA for
glutathione 
synthetase, large 
subunit 


1.7 


! 4038057 


(AC005897) hypothetical 
protein [Arabidopsis thaliana] 


8.8 


171 


Z35824 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBL063w 


1.7 


3021450 


(Y15515) prdl-a [Hydra 
vulgaris] 


7.0 


172 


M65139 


Cowpea chlorotic 
mottle virus (CCMV)
1a protein gene,
complete cds. 


1.7 1 


2506307 


COLLAGEN ALPHA 1(XII)
CHAIN PRECURSOR I (XII) 
chain - chicken 

>gi|222811|gnl|PID|di001160 

gallus] 

>gi|2326442|gnl|PID|e39435 
(X61024) collagen type XII 
alpha 1 chain [Gallus gallus]


7.0 


173 


] 

( 
c 

X15065


Drosophila distal BX-
C region (bithorax
complex) pH189 5'
region;


1.7


1 
1 

( 

1723625 r 


HYPOTHETICAL 10.0 KD
PROTEIN IN ALPA-GABD
INTERGENIC REGION (F87)
>gi|1033124 (U36840)
ORF_f87 [Escherichia coli]
>gi|1788982 (AE000348) orf,
hypothetical protein


6.9 



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SEQ 
ID 


Nearest 

) 

ACCESSIOI^ 


Neighbor (BlastN vs. < 
4 DESCRIPTION 


Genbank) 
P VALUE 


1 Nearest Neish 
1 ACCESSION 


bor (BlastX vs. Non-Redundant P 
DESCRIPTION 


rote ins) 1 
P VALUE 


174 


Z46255 


S.cerevisiae 
chromosome VI 
lambda clone. 


1.7 


I 3875228 


(Z46792) similar to lethal(1)
discs large-1 tumor suppressor
protein-like repeats; cDNA EST
EMBL:D33495 comes from this
gene; cDNA EST
EMBL:D35117 comes from this
gene; cDNA EST
EMBL:D36356 comes from this
gene; cDNA EST EMB...
>gi|3879984|gnl|PID|e1351767
suppressor protein-like repeats;
cDNA EST EMBL:D33495
comes from this gene; cDNA
EST EMBL:D35117 comes
from this gene; cDNA EST
EMBL:D36356 comes from this
gene; cDNA EST EMB...


6.7 J 


175 


U01066 


Human CD4 
promoter, partial 
sequence. 


1.7 


125448 


THYMIDINE KINASE 
saimiriine herpesvirus 1 (strain 
11[Onc]) >gi|60341


6.7


176 


U34743 


Phalaenopsis sp. 
'hybrid SM9108' 
homeobox protein 
mRNA, complete cds 


1.7 


1022918 


(U38184) ATPase subunit 6 
[Trypanosoma cruzi] 


6.7 J 


177 


U14662


Baboon herpesvirus 
HVP2gB 

glycoprotein (UL27) 
gene, complete cds. 


1.7 1 


3218378 


(AL023862) hypothetical 
protein SC3F9.07 [Streptomyces 
coelicolor]


6.7 J 


178 


] 

AB017006


Homo sapiens
PMS2L15 mRNA,
partial cds


1.7


( 

1465855 [ 


(U64859) glutamine-rich protein
[Caenorhabditis elegans]


6.7 J 


179 


I 
1 

i 
t 

U92651 c 


Brassica oleracea var.
botrytis tonoplast
intrinsic protein
bobTIP26-1 mRNA,
complete cds


1.7 II 


I 
( 
r 

3023675 r 


DYNEIN HEAVY CHAIN,
CYTOSOLIC (DYHC) dynein
heavy chain
[Schizosaccharomyces pombe]


6.6


180 


I 

n 

AF000634 n 


Lytechinus variegatus
Notch homolog
mRNA, complete cds


1.7 1 


( 

1 ° 

148574 \l 


(M58520) endo-1,4-beta-
glucanase [Fibrobacter
succinogenes]


6.6 j 
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SEQ 
ID 


Nearest 

A /""« COOT /~\ X. 


Neighbor TBlastN vs. < 
{ DESCRIPTION 


3enbank) 
P VALUE 


f Nearest Neieh 
ACCESSION 


x>r (BlastX vs. Non-Redundant P 
| DESCRIPTION 


roteins) 
P VALUE 


181 


I M92354 


Arabidopsis thaliana 
anthranilate synthase 
alpha subunit gene, 
complete cds. 


1.7 


738308 


blue light photoreceptor 
[Arabidopsis thaliana] 


6.5 


182 


J AJ234856 


Hordeum vulgare
genomic DNA 
fragment; clone 
MWG2234.rev 


1.7 


3142302 


(AC002411) Strong similarity to
myosin heavy chain gb|Z34293
from A. thaliana. [Arabidopsis
thaliana]


6.5 


183 


1 U76827 


Stercorarius 
parasiticus bird J33 
cytochrome b protein, 
partial cds 


! 1.7 


3413810 


(Y17034) Bassoon [Mus 
musculus] 


5.4 


184 


U05211 


Saccharomyces 
cerevisiae Ttplp 
(TTP1) gene, 
complete cds. 


1.7 


403173 


(L24492) lipoprotein 
[Rhodococcus erythropolis] 


4.9 


185 


AF076974 


Homo sapiens 
TRRAP protein 
(TRRAP) mRNA, 
complete cds 


1.7 


1170140 


PUTATIVE 

ENDOGLUCANASE TYPE K 
PRECURSOR (ENDO-1,4- 
BETA-GLUCANASE) 
(CELLULASE) 


4.1 


186 1 


AE000753 


Aquifex aeolicus 
section 85 of 109 of 
the complete genome 


1.7 


1169357 


DNA ADENINE METHYLASE 
site-specific DNA- 
methyltransferase (adenine- 
specific) dam methylase gene 
product [Vibrio cholerae] 


4.0 


187 I 


AF005638 


Tupaia glis 
apolipoprotein AI 
prepropeptide 
mRNA, complete cds 


1.7 


1 

3355682 


(AL031124) putative secreted
lyase


4.0 


188 | 


< 


Human germline IgK 
"hain gene V3-region, 
-lone numKvjzcnj 


1.7 ! 


< 

2257483 [ 


(AB004534) pi003
[Schizosaccharomyces pombe]


4.0 


189 I 


I 

c 

M24001 c 


Mink enteritis virus
antigenic type 2
capsid protein genes
VP1 and VP2,
complete cds.


1.7 


i 
r 

r 

s 

[ 

2143504 U 


myotonic dystrophy kinase -
mouse (fragment) kinase, DM-
kinase {C-terminal, alternatively
spliced, clone delta IIJIIJV.VJ
mice, brain, Peptide Partial,
74 aa] [Mus sp.] 


3.9 


190 1 


I 

X59964 t 


H. sapiens CST4 gene
for Cystatin D


1.7 


\( 

1766075 |c 


(U37273) winged helix protein
CWH-2 [Gallus gallus]


3.1 
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ID 


Nearest 
ACCESSIOrs 


Neighbor (BlastN vs. ( 
[ DESCRIPTION 


jenbank) 
P VALUE 


Nearest Neiph 
ACCESSION 


Dor (BlastX vs. Non-Redundant P 
DESCRIPTION 


roteins) 
P VALUE 


191 


X95276 


P. falciparum 
complete gene map 0 
plastid-like DNA (IR- 
B) 


f 

1.7 


3219951 


HYPOTHETICAL 11.7 KD 
PROTEIN C6B12.13 IN
CHROMOSOME I 
>gi|2330843|gnl|PID|e334047 
pombe] 


3.0 


192 


D84487 


Rat PMSG-induced 
ovarian mRNA, 
3'sequence, N10 


1.7 


173164 


(J02719) valyl-tRNA synthetase 
[Saccharomyces cerevisiae] 


2.3 


193 


L14851 


Rattus norvegicus 
neurexin III-alpha
gene, complete cds. 


1.7 


3323586 


(AF060869) single-strand 
binding protein [Salmonella 
typhimurium] 


2.3 


194 


M97002 


Xenopus laevis/gilli 
hybrid pseudo-IgH 
chain gene, V region, 
clone LG7G342A. 


1.7 


2118407 


MHC sex-limited protein - 
mouse (fragment) musculus] 


2.3 


195 


L07025 


Bacillus thuringiensis
delta-endotoxin
(CryA(a)) gene, 5'
end. > ::
gDjlj54ozU|lj452U
Sequence 1 from
patent US 5596071 >
::gb|I39790|I39790
Sequence 1 from
patent US 5616495 >
:: gb|AR008487|AR008487
Sequence 1 from
patent US 5753492


1.7 


2496940 


HYPOTHETICAL 53.4 KD 
PROTEIN D1054.13 IN
CHROMOSOME V
>gi|3875316|gnl|PID|e1344967


1.8 


196 


■ 

C7-3 t AQ 


insulin-like growth 
factor II {intron 7} 
[human, Genomic,
1 /uz ntj 


1.7 


3327038 


(AB014512) KIAA0612 protein
[Homo sapiens]


1.8 


197 


I 
I 

i 

D86990 c 


Human (lambda)
DNA for
immunoglobulin light
chain


1.7 


] 
( 

( 

1 
I 

494367 ( 


Fv Fragment (Murine Se155-4)
Complex With The
Trisaccharide: Alpha-D-
Galactose(1-2)[alpha-D-
Abequose(1-3)]alpha-D-
Mannose (P1-Ome) (Part Of
The Cell-Surface Carbohydrate
Of Pathogenic Salmonella)


1.8 
Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
198 | L17027 | Plasmid pFdA (from Fremyella diplosiphon) DNA sequence, including unidentified cds and stem loop. | 1.7 | 1082702 | poliovirus receptor-related protein - human | 1.4
199 | AL022273 | Caenorhabditis elegans cosmid H22D14, complete sequence [Caenorhabditis elegans] | 1.7 | 3924605 | (AF069442) putative inhibitor of apoptosis [Arabidopsis thaliana] | 1.4
200 | U89926 | Drosophila melanogaster cut gene, partial sequence | 1.7 | 2245100 | (Z97343) DNA-binding protein homolog | 1.3
201 | Z25749 | H. sapiens gene for ribosomal protein S7 | 1.7 | 2493459 | PROTEIN KINASE C SUBSTRATE, 60.1 KD PROTEIN, HEAVY CHAIN (PKCSH) (80K-H PROTEIN) >gi|1215746 | 1.1
202 | U59841 | Fundulus heteroclitus lactate dehydrogenase B | 1.7 | 3005587 | (AF048977) Ser/Arg-related nuclear matrix protein [Homo sapiens] | 0.82
203 | X55763 | Rabbit mRNA for smooth muscle calcium channel blocker (CaCB) receptor | 1.7 | 3883128 | (AF082302) arabinogalactan-protein [Arabidopsis thaliana] | 0.82
204 | Z75528 | Caenorhabditis elegans cosmid C18BL2A, complete sequence [Caenorhabditis elegans] | 1.7 | 940397 | (D10123) core [Hepatitis C virus] | 0.80
205 | U50912 | Human XIST gene, poly purine-pyrimidine repeat region | 1.7 | 2338027 | (AF005370) large tegument protein [Alcelaphine herpesvirus 1] | 0.59
206 | X12817 | Ovis aries beta-lactoglobulin gene | 1.7 | 987050 | (X65335) lacZ gene product [unidentified cloning vector] | 0.45
207 | AF004419 | Homo sapiens troponin T (TNNT2) gene, exon 13 | 1.7 | 2996364 | (AF053947) unknown [Yersinia pestis] >gi|3883090 | 0.22
208 | L43643 | Gallus domesticus DNA microsatellite marker MCW119 | 1.7 | 464896 | TRANSDUCIN-LIKE ENHANCER PROTEIN 1 enhancer-of-split homolog TLE-1 - human >gi|307510 | 0.20
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















209 


Z73278 


S.cerevisiae 

chromosome XII
reading frame ORF
YLR106c


1.7 


1351657 


HYPOTHETICAL m.VKD 
PROTEIN C30D11.04C IN
CHROMOSOME I
>gi|2130411|pir||S62562
hypothetical protein
SPAC30D11.04c - fission yeast
nuclear pore complex protein
[Schizosaccharomyces pombe]


0.20 


210 


M22345 


A/foi i cp pnHnopnniiQ 

provirus gag, pol, and 
env region DNA. 


1.7 


2444455 


(AF020765) hypothetical 
protein [Myxococcus xanthus] 


0.12 


211 


/VbUUUJuU 


Escherichia coli K-12 
MG1655 section 250 
of 400 of the 
complete genome 


1.7


Z / jOjD 1 


(AF039038) No definition line
found [Caenorhabditis elegans]




212 


AB020692 


Homo sapiens mRNA 
for KIAA0885 
protein, complete cds 


1.7 


2605924 


(AF029726) histidine kinase C 
[Dictyostelium discoideum] 


0.094 


213 


^ AQ/I7Q 


testis-determining 
gene/SRY homolog
[Sminthopsis 
macroura=striped- 
faced dunnarts, 
Genomic, odd ntj 


1.7


7dQQfi 1 f\ 


TONB PROTEIN >gi|1666536
(U23764) TonB [Pseudomonas
aeruginosa]


0.092 


214 


QAQ47Q 


testis-determining 
gene/SRY homolog 
[Sminthopsis 
macroura=striped- 
faced dunnarts, 
Genomic, 855 nt] 


1.7


OAQQCi 1 A 


TONB PROTEIN >gi|1666536
(U23764) TonB [Pseudomonas 


0.088 


215 


U67205 


Mus musculus ACF7 
neural isoform 3 
(mACF7) mRNA, 


1.7 


2047349 


(AF000198) weak similarity to 
HSP90 [Caenorhabditis elegans] 


0.052 


216 


X98188 


Artificial DNA 
sequence for 
mammalian lambda- 
neo minichromosome, 
1400 bp 


1.7 


2493779 


PUTATIVE CUTICLE 
COLLAGEN C09G5.6 
collagen; cDNA EST yk244c3.5 
comes from this gene; cDNA 
EST yk244e3.3 comes from this 
gene [Caenorhabditis elegans] 


0.042


217 


U70139 


Mus musculus 
putative CCR4 
protein mRNA, 
partial cds 


1.7 


2252630 


(U95973) hypothetical protein 
[Arabidopsis thaliana]


0.041 



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SEC 
ID 


Nearest 

) 

ACCESSION 


Neighbor (BlastN vs. 
4 DESCRIPTION 


Genbank) 
P VALUE 


Nearest Neigh 

ACCESSION 


bor (BlastX vs. Non-Redundant P 
DESCRIPTION 


roteins) 
P VALUE 






Homo sapiens alpha-
1 type V collagen
(COL5A1) gene, 5'
flank and exon 1.


1.7 


1 2895760 


(AF045246) universal minicircle
sequence binding protein
minicircle sequence binding
protein [Crithidia fasciculata]


0.039 


219 


Z72151 


B.napus mRNA for 
AMP-binding protein 


1.7 


1 190475 


(K02576) salivary proline-rich 
protein 1 [Homo sapiens] 


0.011 


220 


X94152 


R.norvegicus mRNA 
for cysteine sulfinate
decarboxylase 


1.7 


2136212 


synapsin IIb - human
>gi|1594277 (U40215) synapsin
IIb [Homo sapiens]


0.008 


221


L20255 


Mouse stathmin gene 
sequence. 


1.7


I 2317934 


(U97553) unknown [murine 
herpesvirus 68] 


0.006 1 


222 


L13600 


Rattus norvegicus 
glycine transporter 
mRNA, complete cds. 


1.7 


726403 


(U23175) similar to anion 
exchange protein 
[Caenorhabditis elegans] 


0.003 


223 


AJ224150 


Plasmodium berghei 
EF-1 alpha A- gene 


1.7 


2072290 


(U95094) XL-INCENP 
[Xenopus laevis] 


0.001 


224 


S80642


butyrophilin [mice, 
lactating mammary 
gland, mRNA Partial, 
3193 nt] 


1.7 


2695746 


(AJ223010) Pmt2
[Schizosaccharomyces pombe]


9e-04


225 


M22363 


C.elegans unc-86
gene encoding two 
alternative proteins, 
complete cds. 


1.7


2224683 


(AB002369) KIAA0371 [Homo 
sapiens] 


1e-04


226 


X92123 


M.musculus cgt gene 
exon 1


1.7


3874232 


(Z49909) similar to Prokaryotic 
ribonuclease PH 

[Caenorhabditis elegans]


3e-05


227 


] 
( 

AB016000


Ipomoea nil PKn2
(knotted-like gene)
mRNA, complete cds


1.7


( 

2183083 


AJrUU0422) TTF-I interacting 
peptide 5 [Homo sapiens] 


1e-05


228 


E 

D14L33 s 


Bovine mRNA for
synaptocanalin I


1.7


I 
F 

y 
g 

c 

3925277 f( 


(AL032643) similar to
Uncharacterized protein family
UPF0034, Double-stranded
RNA binding motif; cDNA EST
yk489b3.5 comes from this
gene; cDNA EST yk439g7.5
comes from this gene
[Caenorhabditis elegans]


2e-06



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















229 


L01991 


Mus musculus TAFG- 
1-like neuronal 
glycoprotein (PCS) 
mRNA, complete cds. 


1.7 


3006139 


(AL022299) hypothetical 
protein 


4e-07 


230 


X63016 


Tomato yellow leaf
curl virus Thailand 
isolate complete 
genome (TYLCV-TH 
B-DNA) 


1.7 


3643608 


(AC005395) hypothetical 
protein [Arabidopsis thaliana] 


1e-07


231 


Z22802 


H. sapiens 

microsatellite repeat. 

> :: 

gb|G34562|G34562 
human STS SHGC- 
51834 


1.7 


100210 


extensin precursor (clone Tom L 
4) - tomato esculentum] 


4e-09 


232 


K02765 


Human complement 
component C3 

beta subunits, 
complete cds. 


1.7 


2984320 


(AE000773) acetoin utilization 
protein [Aquifex aeolicus] 


1e-09


233 


£-> I HO I O 


S.cerevisiae 
chromosome XV 
reading frame ORF 

I ULU / DW 


1.7


3o7J7UU 


(Z73107) predicted using
Genefinder; Similarity to
Bacillus subtilis DNAJ protein
gene; cDNA EST
EMBL:C12520 comes from this
gene; cDNA EST
EMBL:D71409 comes from this
ge...


7e-11


234 


D21871 


Pig mRNA for thimet 
oligopeptidase 


1.7 


2632098 


(Y15513) Prodos protein
[Drosophila melanogaster]


8e-13 


235 


Y14344


Gallus gallus gene 
encoding neurofascin, 
exons 9,10,11 & 12 


1.7 


3876421 


(Z&ltyfO) cDNA EST
EMBL:C12730 comes from this
gene; cDNA EST yk200b6.5
comes from this gene; cDNA
EST yk349a12.5 comes from
this gene [Caenorhabditis
elegans]


3e-14 


236 


Z73608 


S.cerevisiae 
chromosome XVI
reading frame ORF 
YPL252c 


1.7 


1 

1439663 < 


(U64605) C05D9.6 gene
product [Caenorhabditis
elegans]


6e-18
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) | 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 




P VALUE 












OLIGOSACCHARVL 




237 


AG000518 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
T171N23 


1.7 


1 174468 


TRANSFERASE STT3 
SUBUNIT HOMOLOG 
>gi|529357 (U13019) No 
definition line found 
[Caenorhabditis elegans] 


6e-18 


238 


D17716 


Human mRNA for N- 
acetylglucosaminyltra 
nsferase V T complete 
cds 


1.7 


961446 


(D63877) KIAA0157 gene 
product is novel. 


5e-19 


239 


AF102512 


Cheilodactylus 
vittatus country USA: 
Midway Island 
cytochrome c oxidase 
subunit I gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


1.7 


1572756 


(U70848) C43G2.1 gene 
product [Caenorhabditis 
elegans] 


5e-40 


240 


L30107 


Rattus norvegicus 
liver-specific 
transporter gene, 
promoter region. 


1.7 


4176443 


(AL022238) dJ1042K10.4 
(novel protein) 


3e-49 


241 


X91220 


H. sapiens mRNA for 
Na-Cl electroneutral 
thiazide-sensitive 
cotransporter 


1.7 


3478637 


(AC005546) R29425_l [Homo 
sapiens] 


6e-54 j 


242 


U97146 


Rattus norvegicus 
calcium-independent 
phospholipase A2 
mRNA, complete cds 


1.6 


<NONE> 


<NONE> 


<NONE> 


243 


Z48508 


Pea seed -borne 
mosaic virus RNA for 
coat protein and 
polymerase (partial) 


1.6 


<NONE> 


<NONE> 


<NONE> 


244 


M1S349 


Rat leukocyte 
common antigen (L- 
CA) gene, exons 1 
through 5.


1.6 


<NONE> 


<NONE> 


<NONE> 


245 


M13158


Yeast (S.pombe) 
cdc25+ gene (mitosis 
initiation), complete 
cds. 


1.6 


<NONE> 


<NONE> 


<NONE> 


246


U39712


Mycoplasma
genitalium section 34
of 51 of the complete
genome


1.6 


<NONE> 


<NONE> 


<NONE> 


247 


M17922


urokinase-type 
plasminogen activator 
protein gene, 
complete cds. 


1.6 


3875750 


(Z81499) predicted using
Genefinder; cDNA EST
yk410e3.3 comes from this
gene; cDNA EST yk410e3.5
comes from this gene
[Caenorhabditis elegans]


8.0 


248 


M89986 


Human polymorphic
loci in Xq28.


1.6 


3261710 


^o4/z4; psu [Mycobacterium
tuberculosis]


6.4 


249 


M89986 


Human polymorphic 
loci in Xq28. 


1.6 


2143805 


inositol-polyphosphate 4- 
phosphatase - rat 


6.2 


250 


U DO / 


Rattus norvegicus 
Deleted in colorectal 
Cancer 


1.0 


1 ^<AQC\A 

12jOoU4 


(U51449) RING3 protein 
[Xenopus laevis] 


5.8 


251 


X95199 


P.platessa GSTA, 
vjj i rvi , uj i ana 
PPTN genes 


1.6 


3915113 


MALEYLACETATE 
REDUCTASE Pseudomonas 
cepacia >gi|643636 (U19883) 
maleylacetate reductase 
[Burkholderia cepacia] 


4.9 


252 


Y09103 


D.melanogaster 
RPA1 gene 


1.6 


3916021 


HYPOTHETICAL 91 KD 
PROTEIN IN COB INTRON 
>gi|2654230|gnl|PID|e1192341
(X02819) unidentified reading 
frame [Schizosaccharomyces 
pombe] 


4.8 


253 


214078 


1 -UCMI V U1I1 

mitochondrion fMet, 
18S, 5S repeat unit 
DNA 


1.6 


2501668 


DYSTROPHIN-RELATED
PROTEIN 2 sapiens] 


3.6 


254 


AB002314 


Human mRNA for 
KIAA0316 gene,
complete cds 


1.6 


130997 


REPETITIVE PROLINE-RICH 
CELL WALL PROTEIN 1 
PRECURSOR 

>gi|81809|pir||A29324 proline-
rich protein precursor - soybean
>gi|170049 (J02746) proline-
rich protein [Glycine max]


2.8 


255 


M21488


Human muscle 
creatine kinase gene 
(CKMM). exon 2. 


1.6 


119399 


ENV POLYPROTEIN
PRECURSOR (COAT
POLYPROTEIN) [CONTAINS:
COAT PROTEIN GP62; COAT
PROTEIN GP40]


2.2 
















256 


AE001164 


Borrelia burgdorferi 
(section 50 of 70) of 
the complete genome 


1.6 


4050089 


(AF109907) hypothetical
protein [Homo sapiens] 


1.5 


257 


X61757 


M.musculus 
rearranged T-cell 
receptor beta variable 
region (Vb17a)


1.6 


3377766 


(AF080090) semaphorin IV 
isoform b [Mus musculus] 


1.2 


258 


M15346 


T.cruzi tandemly 
repeated gene 
encoding an 85 kDa 
antigen with 
homology to heat 
shock proteins. 


1.6


2804437 


(AF043695) similar to zinc
metalloprotease family of
peptidases [Caenorhabditis
elegans]


0.41 


259 


L39018 


Rattus norvegicus 
sodium channel 
protein 6 (SCP6) 
mRNA, complete cds 


1.6 


2920535 


(AF018081) type XVIII
collagen [Homo sapiens] 


0.037 


260 


M29483 


Human leukocyte 
adhesion protein 
p150,95 alpha subunit
gene, exons 7 - 15. 


1.6 


1840045 


(U49082) transporter protein 
[Homo sapiens] 


2e-09 


261 


L06844 


Aspergillus niger beta-
D-fructofuranosidase
(suc1) gene, one
exon.


1.6 


4206210 


(AF071527) putative calcium 
channel [Arabidopsis thaliana] 


9e-10 


262 


M10946


Chicken aldolase B 
gene, complete cds, 
clones lambda- 
C(11.L4). 


1.6 


2746775 


(AF040640) similar to peptidase 
family C19 (ubiquitin carboxyl- 
terminal hydrolase) 
[Caenorhabditis elegans] 


1e-31


263 


X07881 


Human gene PRB3L 
for proline-rich 
protein G1


1.5 


<NONE> 


<NONE> 


<NONE> 


264 


U22260 


Nicotiana tabacum 
UMP synthase (pyr5- 
6) mRNA, partial cds 


1.5 


3880923 


(Z99271) similar to Reverse 
transcriptase comes from this 
gene [Caenorhabditis elegans] 


0.50


265 


U76759 


Mus musculus 
nuclear protein 
NIP45 mRNA, 
complete cds 


1.4 


1330394 


(U58761) C01F1.6 gene product
[Caenorhabditis elegans] 


8.9 












266


AF076470


Rice tungro
bacilliform virus
Serdang strain,
complete genome


1.4


1703461


POTASSIUM-
TRANSPORTING ATPASE
BETA CHAIN (PROTON
PUMP) (GASTRIC H+/K+
ATPASE BETA SUBUNIT)
3.6.1.36) beta chain - human
>gi|184105 (M75110) H,K-
ATPase beta subunit [Homo
sapiens]


8.9 



267 


X64659 


C.jacchus interferon 
gene for interferon 
gamma 


1.4 


1486485


(U28832) US10 [Gallid
herpesvirus 1] >gi|1486497 


6.8 


268 


U11825 


Schistosoma 
japonicum structural 
muscle protein 
paramyosin mRNA, 
complete cds. 


0.88 


<NONE> 


<NONE> 


<NONE> 


269 


D84278 


Human DNA for 
CD38, exon 1 


0.68 


3766363 


(AL031907) hypothetical serine 
rich protein 

[Schizosaccharomyces pombe] 


3.0 


270 


M59755 


Bovine lens aldose 
reductase 

pseudogene, 3' end. 


0.67 


<NONE> 


<NONE> 


<NONE>


271 


M81758


Homo sapiens 
skeletal muscle 
voltage-dependent 
sodium channel alpha 
subunit (SkM1)
mRNA, complete cds. 


U.OJ 


2437819 


(Z86105) 1,4-beta-glucanase 
[Anaerocellum thermophilum] 




272 


L01965 


Human type IV 
sodium channel alpha 
polypeptide 


0.64 


2437819 


(Z86105) 1,4-beta-glucanase 
[Anaerocellum thermophilum] 


3.5 


273 


U90122 


Danio rerio bone 
morphogenetic 
protein-4 (bmp4) 
mRNA, partial cds 


0.63 


2983532 


(AE000720) formate 
dehydrogenase alpha subunit 
[Aquifex aeolicus] 


7.9 


274 


L41624 


Hylobates lar mucin 
(MUC1) gene, exons 
1-6. 


0.63 


1517808 


(D79215) FGF-10 [Rattus 
norvegicus] 


0.91












^uo /yjo) coaea ror oy L.'. 




275 


AF030881 


Fugu rubripes sushi
retrotransposon gag
polyprotein (gag) and
pol polyprotein (pol)
genes, complete cds


0.63 


1519696 


elegans cDNA ykl2or^)o; coded
for by C. elegans cDNA
yk159h6.3; coded for by C.
elegans cDNA yk126f9.3; coded
for by C. elegans cDNA
yk159h6.5 [Caenorhabditis
elegans]


0.38 


276 


U52909


Arabidopsis thaliana
U1 snRNP 70K
protein gene, 
complete cds 


0.62 


<NONE> 


<NONE> 


<NONE> 


277 


AF008192 


Homo sapiens
putative GR6 protein 
(GR6) mRNA, 
complete cds 


0.62 


3800934 


(AF100655) contains similarity 
to ser/thr protein kinases 
[Caenorhabditis elegans] 


9.7 


278 


U17081 


Human fatty acid 
binding protein
(FABP3) gene, 
complete cds 


0.62 


3617848 


(AF049709) tyrosylprotein 
sulfotransferase-A; TPST-A


7.7 


279 


AB018340 


Homo sapiens mRNA 
for KIAA07Q7 
protein, partial cds 


0.62 


424044 


VP5 protein - porcine rotavirus 
>gi|61355 


7.7 


280 


Y00093


H.sapiens mRNA for
leukocyte adhesion
glycoprotein p150,95


0.62 


1054945 


(U38621) polyprotein [Tobacco 
vein mottling virus] 


4.5 


281


M63138 


Human cathepsin D 
(catD) gene, exons 7, 
8, and 9. 


0.62 


136810 


GLYCOPROTEIN M
>gi|73791|pir||WMBE51 UL10
protein - human herpesvirus 1 1-
473) [Human herpesvirus 1]
>gi|221732|gnl|PID|d1002131


3.5 


282


X76056


M. sylvestris DNA for
spacer region
between 25S and 18S
ribosomal RNA genes


0.62 


2661176


(U76671) putative cds
[Rhodobacter sphaeroides]


2.0 


283 


X74501


B.taurus mRNA for
ACTH receptor


0.62 


4249552 


(AB001075) galectin-2 related
protein


2.0 


284


M57634


Rat F1-ATPase beta
subunit mRNA, 3'
end.


0.62 


2119692


transforming growth factor-beta
type III receptor - chicken
>gi|511843 (L01121)
transforming growth factor-beta
type III receptor [Gallus gallus]


1.5 



WO 01/02568 



PCT/US00/18374 







285 


Y15724 


Homo sapiens 
SERCA3 gene, exons 
1-7 (and joined CDS) 


0.62 


2498164 


ASPARTYL/ASPARAGINYL
BETA-HYDROXYLASE
(ASPARTATE BETA-
HYDROXYLASE) (ASP BETA-
HYDROXYLASE) (PEPTIDE-
ASPARTATE BETA-
DIOXYGENASE) beta-
dioxygenase (EC 1.14.11.16) -
bovine >gi|162694 taurus]


0.52 


286 


AL010142 


Plasmodium 
falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
from contig 3-72, 
complete sequence 


0.62 


3183206 


HYPOTHETICAL PROTEIN 
KIAA0009 sapiens]


4e-07 


287 


AB008160


Mus musculus Stat3
gene, 5'-flanking
region and exon 1 
partial sequence 


0.62 


466097 


HVPOlHJbl lL'AL KD 
PROTEIN ZK353.1 IN 
CHROMOSOME III 
>gi|1078903|pir||S44654 
ZK353.1 protein - 

>gi|289757 (L15313) putative
[Caenorhabditis elegans]


1e-35


288 


AB018795


Halomonas marina 
gene for alginate 
lyase, complete cds 


0.62 




(Z.4SJ5Sj) similar to ATPases
associated with various cellular 
activities (AAA); cDNA EST 
EMBL'Z 146^3 comes from this 
gene; cDNA EST 
EMBL:D75090 comes from this 
gene; cDNA EST 
EMBL:D72255 comes from this 
gene, cuina to 1 yk2U(Je4... 


3e-46 




Human DNA
sequence from
cosmid E141E2, on
chromosome 22,
complete sequence
[Homo sapiens]


0.61 


<NONE> 


<NONE> 


<NONE> 


290 


U18259


Human clone CIITA-
8 MHC class II
transactivator CIITA
mRNA, complete cds.


0.61 


1483567


(X79983) viral proteinase
[Pseudorabies virus]


9.8 


291 


X98890


S. tuberosum mRNA
for inorganic
phosphate
transporter, StPT1


0.61 


475724


(U08884) protein VIII precursor
[Bovine adenovirus type 3]


7.6 





292 


U70825 


Rattus norvegicus 5- 
oxo-L-prolinase 
mRNA, complete cds 


0.61 


733543 


(U23448) similar to genome 
polyprotein 

(SP:POLG_BVDVN, P19711);
alternative splicing to C04A2.7a 


4.4 


293 


L81667


Homo sapiens
(subclone 2_a9 from
P1 H49) DNA
sequence 


0.61 


2565087 


(U80759) CAGH4 alternate 
open reading frame [Homo 
sapiens] 




294 


AE000760 


Aquifex aeolicus 
section 92 of 109 of 
the complete genome 


0.61 


2811092 


HOMEOBOX PROTEIN HOX- 
A3 (HOX-1.5) homeobox- 
containing transcription factor 
[Mus musculus] 


2.6 


295 


U58512 


Mus musculus Rho- 
associated, coiled- 
coil forming protein 
kinase p160 ROCK-1
mRNA, complete cds 


0.61 


295671 


(L11275) selected as a weak
suppressor of a mutant of the 
subunit AC40 of DNA 
dependant RNA polymerase I 
and III 


1.5 


296 


U27459 


Human origin 
recognition complex 
protein 2 homolog 
hORC2L mRNA, 
complete cds 


0.61 


200285 


(M97900) putative open reading 
frame [Mus musculus] 


0.66 


297 


L36680 


Pisum sativum S- 
adenosylmethionine 
synthase mRNA, 3'
end. 


0.61 


2285790 


(AB002086) p47 [Rattus 
norvegicus] 


4e-12 


298 


AE000673 


Aquifex aeolicus 
section 5 of 109 of 
the complete genome 


0.61 


3395782 


(AF058446) histone 
macroH2A1.2 [Gallus gallus]


UC <- / 


299 


AF086310


Homo sapiens full
length insert cDNA
clone ZD51F08


0.61 


3646450 


(AL031603) conserved 
hypothetical protein. 
[Schizosaccharomyces pombe] 


8e-29 


300 


AJ009675


Agrotis ipsilon
mRNA for 3-hydroxy-
3-methylglutaryl
coenzyme A
reductase


0.61 


4176370


(AC005058) similar to calcium-
independent phospholipase A2;
similar to AC004392
(PID:g3367519) [Homo
sapiens]


2e-73 


301 


AC005577


Homo sapiens
chromosome 19,
cosmid F18382B,
centromeric end,
complete sequence
[Homo sapiens]


0.60 


<NONE> 


<NONE> 


<NONE> 



302


U40454


Candida albicans
topoisomerase type I
(CATOP1) gene,
complete cds


0.60


<NONE>


<NONE>


<NONE>


303 


J01390


Emericella nidulans 
mtDNA between 
h2/h5 and bh2/b2 
junctions, genes for 
ATPase subunit 6, 
cytochrome oxidase 
subunit 3, seven, 
unidentified proteins, 
twentyfour tRNA's 
and L-rRNA. 


0.60 


<NONE>


<NONE>


<NONE>


304 


L11172


Plasmodium 
falciparum RNA 
polymerase I gene, 
complete cds. 


0.60 


<NONE> 


<NONE>


<NONE> 


305 


Z81079 


Caenorhabditis 
elegans cosmid 
F39H11, complete
sequence
[Caenorhabditis
elegans]


0.60 


<NONE> 


<NONE>


<NONE>




306 


Z49627 


S.cerevisiae 
chromosome X 
reading frame ORF 
YJR127c


0.60


118751 


MAJOR DNA-BINDING 
PROTEIN herpesvirus 1 (strain
11) >gi|60327 (X64346) major
ssDNA-binding protein 
[Saimiriine herpesvirus 2] 


9.6 


307


U94911 


Rattus norvegicus H- 
K- ATPase alpha 2 
gene, alternatively 
spliced products and 
partial cds 


0.60


2213862


(AF003086) PfSNF2L
[Plasmodium falciparum] 


7.4 


308


U67476


Methanococcus
jannaschii section 18
of 150 of the
complete genome


0.60


1749688


(D89240) unnamed protein
product


5.7 


309 


U67513


Methanococcus
jannaschii section 55
of 150 of the
complete genome


0.60 


3327421


(U97068) zonadhesin [Mus
musculus]


4.3 


310


U57817


Haemophilus ducreyi
lipoprotein gene,
complete cds


0.60


4008577


(AL034491) conserved
hypothetical protein
[Schizosaccharomyces pombe]


2.5 





311 


X80700


H. sapiens G17 gene 


0.60 


422541


probable protein-tyrosine kinase
(EC 2.7.1.-) RTK - Pacific
electric ray >gi|290858


1.5 


312 


L42167


Mus musculus (clone
R24) rds gene, partial
cds 


0.60 


4220848


(AF033823) moira [Drosophila
melanogaster]


0.51


313 


U54777


Human hMSH6 
mRNA, complete cds 


0.60 


2665637


(AF031087) mismatch repair
protein MSH6 [Mus musculus]




314 


D86985


Human mRNA for 
KIAA0232 gene, 
complete cds 


0.60 


1938462


(U97006) No definition line
found [Caenorhabditis elegans]


2e-07 


315 


D43964 


Rat liver mRNA for 
Kan-1, complete cds 


0.60 


1280135


(U^376) coded for by C.
elegans cDNA cm21e6; coded
for by C. elegans cDNA
cm01e2; similar to melibiose
carrier protein
(thiomethylgalactoside permease
II)


c ~ tc I 


316 


U49058 


Rattus norvegicus 
CTD-binding SR-like 
protein rA4 mRNA, 
partial cds 


0.60 


2145091 


(U37500) RNA polymerase II 
largest subunit [Mus musculus]


1e-19


317 


X84388 


U.ruddi 

mitochondrial 12S 
ribosomal RNA 


0.60


3874247 


(Z70205) predicted using 
Genefinder 


2e-37 


318


AF125447


Caenorhabditis 
elegans cosmid 
Y14H12B 


0.59


<NONE> 


<NONE> 


<NONE>


319


U20189


Hyoscyamus muticus 
clone cVS2 
vetispiradiene 
synthase mRNA, 
partial cds.


0.59


<NONE> 


<NONE> 


<NONE>


320


M63962


Human gastric H,K-
ATPase catalytic
subunit gene,
complete cds.


0.59


<NONE> 


<NONE> 


<NONE> 


321


AJ132366


Helicobacter pylori
(strain P1) comB and
pmi/algA (partial)
genes, and partial
ORF1 and ORF2


0.59


<NONE> 


<NONE> 


<NONE>






322


U17289


Mus musculus
transcription factor
AP-2 (AP-2) gene,
alternative exon 1a,
and isoform 2, partial
cds.


0.59 


2459419 


(AC002332) hypothetical 
protein [Arabidopsis thaliana]


0 4 


323 


Z71466


S.cerevisiae
chromosome XIV
reading frame ORF
YNL190w


0.59 


3875542 


(Z67990) Similarity to Rat 
amiloride-sensitive sodium
channel beta-subunit 


7.3 


324 


Z66493 


Beet soil-borne virus 
genes for 13K, 22K 
and 48K proteins 


0.59 


2119867 


cryV465 protein - Bacillus 
thuringiensis thuringiensis]


7.2 


325 


L41351 


Homo sapiens 
prostasin mRNA,
complete cds 


0.59 


729212 


CRYSTALLIN J1C crystallin
[Tripedalia cystophora]


4.2 


326 


X79854 


S.lincolnensis gene 
for 16S ribosomal 
RNA 


0.59 


3702828 


(AF056577) high mobility 
group protein 1.2 


3.2 


327 


AJ223356 


Strongylocentrotus
purpuratus mRNA for
SuDp98 protein 


0.59 


2495704 


HYPOTHETICAL PROTEIN
KIAA0129 product is novel.
[Homo sapiens]


2.5 


328 


X86019 


H. sapiens mRNA for 
PRPL-2 protein 


0.59 


1743341 


(Y10027) transcription factor
TEF-1 [Mus musculus]


2.5 


329 


U75528 


Xiphias gladius 
creatine kinase gene, 
partial cds 


0.59 


1845995 


(U69477) envelope glycoprotein 
[Human immunodeficiency virus
type 1] 


2.4 


330 


AC005573 


Homo sapiens
chromosome 5, PAC
clone 202e13


0.59 


2506366 


DNA POLYMERASE
EPSILON SUBUNIT B DNA-
directed DNA polymerase (EC
2.7.7.7) II chain B - yeast
(Saccharomyces cerevisiae) 
>gi|786319 (U25842) DNA 
Polymerase epsilon, subunit B 
(Swiss Prot. accession number 
P24482) [Saccharomyces 
cerevisiae] 


1.4 


331 


L19180 


Rat receptor-linked 
protein tyrosine
phosphatase 


0.59 


1235974


(X96713) collagen [Globodera
pallida]


1.1 


332 


L32090


Listeria
monocytogenes secA
gene, complete cds.


0.59 


2291129


(AF016415) No definition line
found [Caenorhabditis elegans]


0.83 






333


U24433


Xenopus laevis
syndecan-2 mRNA,
complete cds.


0.59


3355692


(AL031124) hypothetical
protein SC1C2.25c
[Streptomyces coelicolor]


0.64 


334 


M23412 


Drosophila
muscarinic
acetylcholine receptor
mRNA, complete cds


0.59 


168237 


uvi/oj^foj hydroxyproline-rich
protein [Helianthus annuus]


0.22 


335 


AF060729 


Synaphea media 
chloroplast atpB-rbcL 
intergenic spacer 
region, partial 
sequence 




/ j 1 DyO 


HYPOTHETICAL 67.5 KD
PROTEIN IN PRPS4-STE20
INTERGENIC REGION
>gi|626567|pir||S46825
hypothetical protein YHL010c -
yeast (Saccharomyces
cerevisiae) >gi|2289881
(U11582) No definition line
found [Saccharomyces


0.16


336 


AF029734 


Xanthobacter 
autotrophicus
transcriptional
activator AldR (aldR)
gene, partial cds; and
NAD-dependent
chloroacetaldehyde 
dehydrogenase (aldB) 
gene, complete cds 


0.59 


2498801 


PERIAXIN 

>gi|2143901|pir||I58157 periaxin 

Idl >>gl|JUJZ:7/ ^Z.ZVO'+y ) 

periaxin [Rattus norvegicus]


0.13 


337 


X95307


C.reinhardtii LIS18r-
l gene 


0.59 


1723781 


h x i tm 1 1LAL 34 V J KD 
PROTEIN IN TAF145-YOR1
INTERGENIC REGION
>gi|2131717|pir||S64612
hypothetical protein YGR277c -
yeast (Saccharomyces
cerevisiae)
>gi|1323505|gnl|PID|e243248
(Z73062) ORF YGR277c
[Saccharomyces cerevisiae]


1e-04


338 


M24572


Dictyostelium
discoideum tRNA-
Glu-GAA gene, clone
/GluGAAS. 


0.59 


1176186


HYPOTHETICAL 43.3 KD
GTP-BINDING PROTEIN IN
DACB-RPMA INTERGENIC 
REGION >gi|606121 coli] 


3e-06 


339 


U73733


Human hMSH6 gene,
exon 2


0.59 


2665637


(AF031087) mismatch repair
protein MSH6 [Mus musculus]


5e-07 
340


D90747


Escherichia coli
genomic DNA. (25.2 -
25.6 min)


0.59


134286


DOLICHOL KINASE


6e-08


341 


J05211


Human desmoplakin 
mRNA, 3' end. 


0.59 


246796


major centromere protein, 
CENP-B [human, Peptide, 594 
aa] 


4e-08


342 


L24441


Loligo pealii kinesin 
light chain mRNA, 
complete cds. 


0.59 


547800


KINESIN LIGHT CHAIN 
(KLC) sea urchin 
(Strongylocentrotus purpuratus) 
>gi|161530 


5e-14 


343 


M25140 


Human cardiac alpha- 
myosin heavy chain 
(MYH6) gene, exons 
2, 3 and 4. 


0.58 


<NONE>


<NONE> 


<NONE> 


344 


L81932


Homo sapiens
(subclone 9_h2 from
P1 H21) DNA
sequence 


0.58 


<NONE> 


<NONE> 


<NONE>


345 


AF037966 


Homo sapiens full 
length insert cDNA 
clone YU51G04 


0.58 


<NONE> 


<NONE> 


<NONE>


346 


Z78574 


H.sapiens flow-sorted
chromosome 6 TaqI
fragment,
SC6pA10G11


0.58 


<NONE> 


<NONE> 


<NONE>


347


AF068061 


Blattella germanica 
allatostatin 
neuropeptide 
precursor, gene, 
complete cds 


0.58


<NONE> 


<NONE> 


<NONE>


348


AF015592 


Homo sapiens Cdc7 
(CDC7) mRNA, 
complete cds 


0.58


<NONE> 


<NONE> 


<NONE> 


349


AF028006


Methanosarcina
barkeri atp operon:
ATP synthase beta
subunit (atpD), ATP
synthase epsilon
subunit (atpC), ATP
synthase gene 1
(atpI), ATP synthase
l subunit subunit (... 


0.58 


3184291


(AC004136) putative DNA
polymerase III gamma subunit


9.4 


350


AB017032


Mus musculus gene
for pancreatic trypsin,
complete cds


0.58


3170561


(AF056704) synapsin IIIa
[Rattus norvegicus]


9.2






351


AF081585


Dictyostelium
discoideum
developmental
protein DG1110
(DG1110) gene,
partial cds


0.58 


105417 


basic proline-rich peptide IB-8a 
human 


9.2 


352 


AF086322 


Homo sapiens full 
length insert cDNA 
clone ZD53E01 


0.58 


93026 


hypothetical protein - African 
swine fever virus (strain Malawi 
Lil-20/1) >gi|450758 (X71982) 
myeloid differentiation antigen 
homologue [African swine fever 
virus] >gi|903686 (M95672) 
unknown protein 


7.1 


353 


AF088025 


Homo sapiens full 
length insert cDNA 
clone ZC19C04 


0.58 


2384644 


(U92805) thrombospondin-3 
[Xenopus laevis] 


7.0 


354 


AB002339 


Human mRNA for
KIAA0341 gene, 
partial cds 


0.58 


2135587 


M130 antigen (cytosolic variant
2) - human 


5.4 


355 


U67548 


Methanococcus 
jannaschii section 90 
of 150 of the 
complete genome 


0.58 


2911094


(AL021957) hypothetical 
protein Rv2174 


4.2 


356 


L07868 


Homo sapiens 
receptor tyrosine 
kinase (ERBB4) 
gene, complete cds. 


0.58 


461922 


DECARBOXYLASE (8-10 NM 
CYTOPLASMIC FILAMENT- 
ASSOCIATED PROTEIN) 
(P59NC) 4.1.1.1) - Neurospora
crassa >gi|293948 (L09125)
pyruvate decarboxylase
[Neurospora crassa]
>gi|1655909


4.2 


357 


X03897 


Bacillus subtilis 
sigma 43 operon with 
P23-dnaE-rpoD genes 
(dnaE for DNA 
primase, rpoD for 
RNA polymerase) 


0.58 


1323704 


(U55387) similar to C. elegans 
F38E1.9 gene product encoded 
by GenBank Accession Number 
U41996 [Cricetulus griseus] 


4.1 


358 


D76419 


Desulfovibrio 
vulgaris rbo gene for 
desulfoferrodoxin and 
rub gene for 
rubredoxin, complete 
cds 


0.58 


3420047 


(AC004680) putative protein 
kinase [Arabidopsis thaliana] 


2.4 






359


Z82174


Human DNA
sequence from
cosmid B20F6 on
chromosome 22,
complete sequence
[Homo sapiens]


0.58 


2145455 


(Y07866) catalase-peroxidase 


2.4 


360 


M33642 


F.solani STI35 
protein gene, 
complete cds. 


0.58 


2896706 


(AL021897) hypothetical 
protein Rv1069c


2.4 


361 


U64873 


Mus musculus 
transforming growth 
factor alpha (TGF 
alpha) gene, partial 
cds 


0.58 


3874437 


(Z81038) predicted using 
Genefinder; cDNA EST 
yk488a2.5 comes from this gene 
[Caenorhabditis elegans] 


1.8 


362 


AB002132


Macrophthalmus 
banzai mitochondrial 
DNA for 12S and 
16S rRNA, partial 
and complete 
sequence 


0.58 


2960022 


(AJ224676) rho type GEF 
[Drosophila melanogaster] 


1.8 


363 


AF070070


Caenorhabditis 
elegans MutS 
homolog (msh-5) 
mRNA, partial cds 


0.58 


4098205 


(U75869) Omp22 [Helicobacter 
pylori] 


1.8 


364 


AF045240 


Staphylococcus 
epidermidis plasmid 
pIP1629 mobilization 
protein (mobC1),
(orf69-1), (mobA1),


0.58 


4218117 


(AL035353) protein (fragment) 


0.62 


365 


X61637 


H. sapiens Wilms 
tumor gene 1, exons 8 
and 9 


0.58 


2331059 


(U88211) unknown [Gallus
gallus] 


0.62 


366 


AF039312 


Moraxella catarrhalis 
strain 4223 transferrin 
binding protein A 
(tbpA) and transferrin
binding protein B
(tbpB) genes,
complete cds; and
unknown gene


0.58 


120155 


FIBER PROTEIN
>gi|74229|pir||ERADFM fiber
protein - mouse adenovirus 1
>gi|209758 (M30594) fiber
protein [Mastadenovirus mus1]


0.27


367 


D87463 


Human mRNA for 
KIAA0273 gene, 
complete cds 


0.58 


3861477 


(U94177) androgen receptor
[Pan troglodytes] 


0.12 


368 


U40342 


Mus musculus ninein 
mRNA, complete cds.


0.58 


4115936 


(AF118223) No definition line
found [Arabidopsis thaliana] 


0.004 
















369 


S57235 


CD68=110kda 
transmembrane 
glycoprotein [human, 
promonocyte cell line 
U937, mRNA, 1722 
nt]


0.58 


2072501 


(U96113) WWP1 [Homo 
sapiens] 


1e-04


370 


U39391 


Mus musculus 
serotonin 1A receptor
mRNA, complete cds. 


0.58 


1469876 


(D63481) The KIAA0147 gene 
product is related to adenylyl 
cyclase. [Homo sapiens] 


1e-07


371 


D00056 


Monkey B- 
lymphotropic 
papovavirus genes for 
VP-1,2, 3 and large 
T antigen, complete 
and partial cds, strain 
LPV-76 > ::
gb|M14494|PPMVP1
M Monkey B-
lymphotropic
papovavirus mutant
(LPV-76) PstI B
fragment encoding
VP1, VP2, VP3 and
T-antigen.


0.58 


2462069 


(AJ001774) vanadium 
chloroperoxidase 


1e-08


372 


M77182 


Amsacta 
entomopoxvirus 
spheroidin gene, 
complete cds, and 
four vaccinia related 
orfs. > :: 

gb|I16670|I16670
Sequence 1 from
patent US 5476781


0.58 


1730722 


HVPUlHhilCAL 4J.S KD 
PROTEIN IN NCE3-HHT2 
INTERGENIC REGION 
>gi|2131871|pir||S62957 
hypothetical protein YNL035c - 
yeast (Saccharomyces 
cerevisiae) 

>gi|1301880|gnl|PID|e239670 
(Z71311) ORF YNL035c 
[Saccharomyces cerevisiae] 


8e-14 


373 


S72579 


igloo-S=growth-
associated protein 
GAP-43 homolog 


0.58 


2689720 


(AF037168) DnaJ homologue 
[Arabidopsis thaliana]


7e-14 


374 


AF018165 


Tetraodon fluviatilis 
amyloid precursor 
protein mRNA,
complete cds


0.58 


3219938 


HYPOTHETICAL 34.9 KD 
PROTEIN C57A10.11C IN
CHROMOSOME I
>gi|2058378|gnl|PID|e314002
pombe] 


5e-22 
















375


U81803 


Filobasidiella 
neoformans
translation elongation 
factor EF1 -alpha 
(CnTEFl) mRNA, 
complete cds 


0.57 


<NONE> 


<NONE> 


<NONE> 


376 


U09781 


Candida albicans 
ATCC 18804, CBS 
562 peptide 
transporter gene, 
complete cds. 


0.57 


<NONE> 


<NONE> 


<NONE> 


377 


AC002143 


Homo sapiens 
(subclone 4_b10 from
BAC H102) DNA
sequence 


0.57 


<NONE> 


<NONE> 


<NONE> 


378 


U23442 


Tetrahymena 
thermophila RR 
internal deletion 
sequence. 


0.57 


<NONE> 


<NONE> 


<NONE> 


379


U17289 


Mus musculus 
transcription factor 
AP-2 (AP-2) gene, 
alternative exon la, 
and isoform 2, partial 
cds. 


0.57 


<NONE> 


<NONE> 


<NONE> 


380 


X70844 


Buzura suppressaria 
nuclear polyhedrosis 
virus gene for 
polyhedrin protein 


0.57 


<NONE> 


<NONE> 


<NONE> 


381 


AJ012159


Homo sapiens 5T4
Oncofetal trophoblast
glycoprotein gene 


0.57 


<NONE> 


<NONE>


<NONE> 


382 


X76571


H.sapiens simple
DNA sequence region
clone wg1a8.


0.57 


<NONE> 


<NONE> 


<NONE> 


383 


AF034434


Vibrio cholerae
pathogenicity island, 
putative transposase, 
aldehyde 
dehydrogenase 
(aldA), toxR- 
activated gene A 
protein (tagA), 
putative inner 
membrane protein, 
and putative zinc 
metalloprotease 
genes, complete cds; 
and... 


0.57 


<NONE> 


<NONE> 


<NONE> 


384 


AB017031 


Mus musculus gene 
for TESP4, complete 
cds 


0.57 


<NONE>


<NONE> 


<NONE> 


385 


X89788 


S.hispidus 
mitochondrial DNA 
for SSU ribosomal 
RNA gene 


0.57 


<NONE> 


<NONE> 


<NONE> 


386 


L16921 


Rat progesteron 
receptor gene, 5' 
untranslated region.


0.57


3323116 


(AE001251) femA protein, 
putative [Treponema pallidum] 


8.9 


387


AF027292 


Homo sapiens 
interferon regulatory 
factor 6 


0.57 


259790 


(S48157) DNA polymerase- 
primase 180 kda subunit 
[Drosophila melanogaster. 
Peptide, 1490 aa] 


6.7 


388


AJ012581


Cicer arietinum
mRNA for 
cytochrome P450 


0.57 


2131498 


hypothetical protein YDR446w - 
yeast CAI: 0.11 [Saccharomyces 
cerevisiae] 


5.3 


389


L15363 


Human transfer RNA- 
Met (TRMEP1) 
pseudogene, complete 
gene


0.57 


3228680


(AF070935) GABA receptor
subunit [Musca domestica]


5.2


390


AE000525


Helicobacter pylori
26695 section 3 of
134 of the complete
genome


0.57 


1938478


(U97008) weak similarity to
family 1 of G-protein coupled
receptors [Caenorhabditis
elegans]


4.0 


391


AF020189


Amblyomma
americanum
ecdysteroid receptor
(AamEcR) mRNA,
UTR. resion 1 


0.57 


2072224


(U94875) p40 [Borna disease
virus]


4.0 


392 


X56997 


Human UbA52 gene 
coding for ubiquitin- 
52 amino acid fusion 
protein 


0.57 


2960113


(AL022121) hypothetical 
protein Rv3689 


4.0 


393


AL010260


Plasmodium
falciparum DNA ***
SEQUENCING IN
PROGRESS ***
from contig 4-81,
complete sequence


0.57


117233


CYTOCHROME P450 2C14
(CYPIIC14) phenobarbital-
inducible, hepatic - rabbit P-450
[Oryctolagus cuniculus]
>gi|358265|prf||1306317A
cytochrome P450 [Oryctolagus 
cuniculus] 


3.9


394


M99581


Xenopus laevis
gamma-crystallin
(gcry3) gene,
complete cds.


0.57


141647


GASTRULA ZINC FINGER
PROTEIN XLCGF44.2
>gi|85736|pir||S06571 finger
protein (clone XlcGF44-2) -
African clawed frog (fragment)


3.0


395


M38384


Drosophila 
melanogaster seven in 
absentia mRNA, 
complete cds. 


0.57 


1707127 


(U80454)T16A1.1 
[Caenorhabditis elegans] 


3.0 


396 


U32795 


Haemophilus 
influenzae Rd section 
110 of 163 of the 
complete genome 


0.57


1173433 


IRON(III)-TRANSPORT 
SYSTEM PERMEASE 
PROTEIN SFUB >gi|152861 
(M33815) protein (sufB) 


2.3 


397


X12600


Klebsiella
pneumoniae nifX,
nifU, nifS, nifV and
nifW genes


0.57


2909562 


(AL021925) hypothetical 
protein Rv2256c 


1.4 


398 


AB014526


Homo sapiens mRNA
for KIAA0626
protein, complete cds


0.57


482390


insect-stage-specific protein -
Trypanosoma cruzi >gi|162099
[M6;>021) insect stage-specific
antigen


0.61 


399 


AF063587


Rhodococcus fascians
strain NRRL-B-
15096 hypothetical
protein gene,
complete cds


0.57


4104321


(AF034582) vesicle associated
protein [Rattus norvegicus]


0.46 


400 


L11117


Guinea pig estrone
sulfotransferase gene.


0.57


82584


alpha/beta-gliadin precursor
(clone A212) - wheat


0.35 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


401 


V00829


Mouse complete gene
for a mouse kallikrein
gene. Genes are mGK-
1 (complete gene)
and mGK-2 of
hormones, e.g.,
grow... > ::
gb|J00390|MUSKAL
07 Mouse pseudo-
kallikrein 2, exons 4
and 5, and kallikrein
1 gene, complete cds.


0.57


2500916


NUCLEAR HORMONE
RECEPTOR NOR-2 receptor
[Rattus norvegicus]
>gi|1583604|prf||2121281A
NOR-2 protein [Rattus 
norvegicus] 


0.20 


402 


X53092 


Chicken mRNA for 
beta-2 subunit of 
neuronal nicotinic 
acetylcholine receptor 


0.57 


1072256 


(U40953) similar to matrin F/G 
(SP:Q00910) containing C4- 
type zinc-fingers 
[Caenorhabditis elegans]


0.031 


403 


L07939 


Ovis ovis granulocyte 
colony stimulating 
factor 


0.57 


3874345 


iU3j) predicted using 
Genefinder; Similarity to 
dehydrogenases; cDNA EST 
EMBL:D65800 comes from this 
gene; cDNA EST 
EMBL:D76184 comes from this
gene; cDNA EST 
EMBL:D69322 comes from this 
gene; cDNA EST 
EMBL:C08158 comes f... 


3e-07 


404 


U18061


Colletotrichum
gloeosporioides
CAP20 (cap20) gene,
complete cds.


0.57


29 1469


(AC003974) putative ubiquitin 
specific protease 


9e-08 


405 


Z73955


L.japonicus mRNA
for small GTP-
binding protein,
RAB11G


0.57


112894


TUMOR NECROSIS FACTOR,
ALPHA-INDUCED PROTEIN
3 (PUTATIVE DNA BINDING
PROTEIN A20) (ZINC
FINGER PROTEIN A20)
>gi|107549|pir||A35797
probable DNA-binding protein
A20 - human >gi|177866
(M59465) A20


7e-08
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















406 


X04335 


Petunia grp-1 gene 
for glycine-rich 
protein 


0.57 


3876901 


(Z//66U) Similarity to Human 
enoyl-CoA hydratase 
(SW:ECHM_HUMAN); cDNA 
EST EMBL:T00611 comes
from this gene; cDNA EST
yk203d10.3 comes from this
gene; cDNA EST yk203d10.5
comes from this gene; cDNA
EST yk457h5.3 comes from t...


1e-27


407 


U4U / 1 O 


Rattus norvegicus S-
adenosylmethionine
decarboxylase
(AMDP2)
pseudogene


0.56


<NONE>


<NONE>




408 


M60318 


S.cerevisiae SSD1 
protein gene, 

gb|AR013983|AR013
983 Sequence 8 from 
patent US 5773245 


0.56 


<NONE> 


<NONE> 


<NONE> 


409 


X60057


Nicotiana tabacum
blp4 mRNA for
luminal binding
protein (BiP) 


0.56 


<NONE> 


<NONE> 


<NONE> 


410 




Homo sapiens full 
length insert cDNA 
clone i t\.0D/\uy 


0.56




<NONE>




411 


AL010189 


Plasmodium 
falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
rrom conug j-iuz, 
complete sequence 


0.56 


<NONE> 


<NONE>


<NONE> 


412 


X05402 


Murine G-CSF gene
for granulocyte
colony stimulating
factor precursor


0.56 


<NONE> 


<NONE> 


<NONE> 


413 


U92280 


Rattus norvegicus 
regulator of G-protein 
signalling 12 
(RGS12) mRNA, 
complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


414 


U85660


Human 

papillomavirus strain 
RTRX7 complete 
genome 


0.56 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION 


P VALUE 
















415 


X57626 


M. javanica 
mitochondrion 
ATPase 6, and 
putative tRNA-f-Met 
and tRNA-His genes 


0.56 


<NONE> 


<NONE> 


<NONE> 


416 


AB003363 


Sus scrofa S100C
gene, complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


417 


L42291 


Danio rerio DANA 
element, intron 4. 


0.56 


2650002 


(AE001062) conserved 
hypothetical protein 
[Archaeoglobus fulgidus]


8.7 


418 


AF031826 


Mus musculus 
leukocystatin gene, 
complete cds 


0.56 


462493 


L-LACTATE
DEHYDROGENASE
(IMMUNOGENIC PROTEIN
P36) >gi|479296|pir||S33362 L-
lactate dehydrogenase (EC
1.1.1.27) - Mycoplasma
hyopneumoniae


6.7 


419 


U17068


Pennisetum glaucum
Ac-like element,
AcL2. 


0.56 


399449 


ESCARGOT/SNAIL PROTEIN 
HOMOLOG 


6.7 


420 


Z48042 


H.sapiens mRNA 
encoding GPI- 
anchored protein 
p137


0.56 


141232 


HYPOTHETICAL 8.7 KD 
PROTEIN (READING FRAME 
D) >gi|76316|pir||QQSA7C 
hypothetical protein E-74 


6.7 


421 


AF027657 


Choristoneura
fumiferana 
entomopoxvirus 
nucleotide 
triphosphate 
phosphohydrolase I 
(NPHI) gene, 
complete cds 


0.56 


464999 


PUTATIVE
ACETYLCHOLINE
REGULATOR UNC-18
>gi|480359|pir||S36747 
acetylcholine regulator unc-18 - 
Caenorhabditis elegans 
>gi|247392|bbs| 100294 putative 
acetylcholine regulator unc-18 


5.1 


422 


AB011540 


Homo sapiens mRNA 
for MEGF7, partial 

cds 


0.56 


1718033 


URACIL-DNA
GLYCOSYLASE (UDG)
herpesvirus 2 >gi|695219
(U20824) uracil DNA
glycosylase


5.1 


423 


X59941 


X.maculatus NGF 
gene for nerve growth 
factor


0.56


1169081


COMMON PLANT
REGULATORY FACTOR
CPRF-1 >gi|515621 (X58575)
light-inducible protein CPRF-1
[Petroselinum crispum]
>gi|1498301 (U46217) CPRF1


3.8
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


424 


M72711


Rat transcriptional
repressor of myelin-
specific genes (SCIP)
mRNA, complete cds


0.56


501027


(U01849) ORF2 [Trypanosoma
brucei]


2.3 


425 


AL023850


Caenorhabditis 
elegans cosmid 

1 t ^V7lJ 1 I \ mmnlptp 
J I U / JL/ 1 1 VwiJIJipiCLC 

sequence 

[Caenorhabditis 

elegans] 


0.56 


266771 


CHORISMATE MUTASE
(CM) / PREPHENATE
DEHYDRATASE (PDT) (P-
PROTEIN)
>gi|281791|pir||S26053
chorismate mutase (EC 5.4.99.5)
/ prephenate dehydratase (EC
4.2.1.51) - Erwinia herbicola
>gi|43344


2.3


426 


J U4/OOZ 


Schistosoma mansoni 
gynecophoral canal 
protein mRNA, 
complete cds 


0.56 


2147138


ATP synthase chain 6 - 
Platymonas subcordiformis 
mitochondrion >gi|633582 
(Z47797) ATP synthase subunit 
6 [Platymonas subcordiformis]


2.3 


427 


V00574 


Human germ line 

bladder carcinoma 
oncogene T24 (Gene 
code c-Ha-ras-1) with 
four exons. 


0.56 


1518672 


(U60289) receptor protein 
tyrosine phosphatase psi [Homo 
sapiens] 


1.7


428


77 i <nn
/ 1 


XJaevisHKOH 
gene 


0.56


1651674 


(D90899) ferrichrome-iron 
receptor 


1.3 


429 


M37278 


R.norvegicus renin 
gene,exons 1-9. 


0.56 


2853019 


(AF045141) putative serine 
proteinase [Scirpophaga 
incertulas] 


1.0 


430


D28878


Thermus
thermophilus polA
gene for thermostable
DNA polymerase I,
complete cds


0.56


3659692


(AF068748) sphingosine kinase
[Mus musculus]


0.77 


431


Z15027


H.sapiens HLA class
II DNA


0.56


1304141


fibrinogen A-alpha-
chain


0.76 


432


M14362


Human T-cell surface
antigen CD2 (T11)
mRNA, complete cds.


0.56


2462979


(Y11915) Tenascin-X [Bos
taurus]


0.59 


433


Z50801


Z.mays mRNA for
chlorophyll a/b-
binding protein CP29


0.56


109677


collagen alpha 1(I) chain -
mouse >gi|50487


0.50 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


434 


Z38114


S.cerevisiae
chromosome XIII
cosmid 9745


0.56


140372


HYPOTHETICAL 86.0 KD
PROTEIN IN GLK1-SRO9
INTERGENIC REGION
>gi|83159|pir||S19367
hypothetical protein YCL039w -
yeast (Saccharomyces
cerevisiae) 


0.35 


435 


AF052254 


Escherichia coli DNA
gyrase A (gyrA) gene
partial cds


0.56


2724126


(AF038535) synaptotagmin VII
[Homo sapiens]


0.12


436


1 ArUoUo4y


Tegula pulligo 12S
small subunit
ribosomal RNA gene,
mitochondrial gene
for mitochondrial
RNA, partial
sequence


0.56


3913223


CYCLIN-DEPENDENT
KINASE INHIBITOR 1
p21/WAF1 [Felis catus]


0.11 


437 


AJ005690 


Danio rerio mRNA 
for protein tyrosine 
kinase 


0.56 


2623830 


(AF030962) unknown 
[Schistosoma mansoni] 


7e-06 


438 


U31202 


Human noggin 
(NOGGIN) gene, 
complete cds. 


0.56 


3875475 


(Z78411)F02D8.3 
[Caenorhabditis elegans] 


1e-06


439




Ovis sp. trichohyalin
mRNA, partial


0.56


3386622


(AC004665) unknown protein
[Arabidopsis thaliana]


1e-10


440


U28938


Rattus norvegicus
protein tyrosine
phosphatase D30
mRNA, complete cds


0.56


3293547


(AF072709) putative
oxidoreductase [Streptomyces
lividans]


1e-14


441


AE001171


Borrelia burgdorferi
(section 57 of 70) of
the complete genome


0.56


2315521


(AF016452) similar to the beta
transducin family


4e-16


442


AF036685


Caenorhabditis
elegans cosmid
305B10


0.56


1519671


(U67951) contains similarity to
ATP/GTP-binding site motif A
(PS:PS00017) [Caenorhabditis
elegans]


6e-20 


443


X01173


Xenopus laevis
vitellogenin gene A1
5' flanking region


0.56


1118102


(U41558) K02B2.3 gene
product [Caenorhabditis
elegans]


2e-31 


444


D10911


Mus musculus DNA
for MS2 protein,
complete cds


0.55


<NONE>


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


445


D30010


Rice mRNA EN117,
partial sequence


0.55 


<NONE> 


<NONE> 


<NONE> 


446 


U51991


Escherichia coli 
phosphoprotein 
phosphatase 


0.55 


<NONE> 


<NONE> 


<NONE> 


447 


M18858


Mouse T cell receptor
C-gamma-7.1 mRNA
3' end.


0.55


<NONE> 


<NONE> 


<NONE> 


448


U95218 


Homo sapiens T cell- 
death associated 
protein gene, 
complete cds 


0.55 


<NONE> 


<NONE> 


<NONE> 


449


M14948


Human R-ras gene,
exon 1. 


0.55 


<NONE> 


<NONE> 


<NONE> 


450




Human mRNA for 
KIAA0355 gene, 
complete cds 


0.55 


<NONE> 


<NONE> 


<NONE> 


451


L81689


Homo sapiens
(subclone l_d6 from
P1 H54) DNA
sequence


0.55


<NONE>


<NONE> 


<NONE> 


452




Human myristoylated 
alanine-rich C-kinase 
substrate (MACS) 
gene, 5' end. 


0.55 


3322710 


(AE001220) V-type ATPase, 
subunit B (atpB-1) [Treponema 
pallidum] 


5.0 


453


X62953


R.norvegicus mRNA
(pJG116) with 
repetitive elements 


0.55 


1076802 


extensin-like protein - maize 
>gi|600118 mays] 


5.0 


454


L34630


Synechocystis sp.
mntABC transporter
system: periplasmic-
binding protein
(mntC), complete cds;
(mntA) gene,
complete cds;
membrane protein
(mntB) gene,
complete cds.


0.55


2117632


hydrogen dehydrogenase (EC
1.12.1.2) - Clostridium
acetobutylicum >gi|557064
(U15277) hydrogenase I
[Clostridium acetobutylicum]


5.0 


455


U43521


Plasmodium berghei
merozoite surface
protein-1 gene,
complete cds


0.55


127654


MYOGLOBIN 


4.9 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















456 


Z64937


H.sapiens CpG DNA,
clone 17g7, reverse
read cpg17g7.rt1a.


0.55


417298


MFS18 PROTEIN
PRECURSOR 


3.8 


457 


U10914 


irh83 T-cell receptor 
alpha chain mRNA, 
partial cds. 


0.55 


310406 


(L09212) tat protein [Simian 
immunodeficiency virus] virus] 


3.8 


458 




Homo sapiens 
multidrug resistance 
protein 


0.55


1 JOJZJ 1 


traB gene [Amycolatopsis
methanolica]




459 


M35603 


Mouse Hox-3.1 gene 
and Hox-3.2-Hox-3.1 
intergenic region. 


0.55 


818849 


(U25430) nucleotide 
pyrophosphatase precursor 
[Oryza sativa] 


2.0 


460


AE001395


Plasmodium
falciparum
chromosome 2,
section 32 of 73 of
the complete
sequence


0.55


137532


PROTEIN C2
>gi|74386|pir||WZVZB6 59K
HindIII-C protein - vaccinia
virus (strain WR)


1.7


461 


AE001395 


Plasmodium 
falciparum 
chromosome 2, 
section 32 of 73 of 
the complete 
sequence 


0.55 


137532 


PROTEIN C2 

>gi|74386|pir||WZVZB6 59K 
HindIII-C protein - vaccinia
virus (strain WR) 


1.7 


462 


U59736 


Human transcription 
factor (NFATc.b) 
mRNA, complete cds 


0.55 


3327144 


(AB014565) KIAA0665 protein
[Homo sapiens] 


0.096 


463 


U34860 


Saccharomyces 
cerevisiae origin 
recognition complex 
large subunit (ORC1) 
gene, complete cds 


0.55 


140372 


HYPOTHETICAL 86.0 KD
PROTEIN IN GLK1-SRO9
INTERGENIC REGION
>gi|83159|pir||S19367
hypothetical protein YCL039w - 
yeast (Saccharomyces 
cerevisiae) 


0.017 


464 


AF012341


Homo sapiens
glutaryl-CoA
dehydrogenase
(GCDH) gene, exons
6, 7, 8, 9, and 10


0.55 


1166611 


(U46674) coded for by C. 
elegans cDNA yk27d9.5; coded 
for by C. elegans cDNA 
yk27d9.3; short region of weak 
homology to drosophilia 
suppressor of sable protein 


0.00s 









Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






465


AF004891


HIV-1 isolate
CxA from Kenya,
envelope
glycoprotein C2V3
region (env) gene,
partial cds


0.54 


<NONE> 


<NONE> 


<NONE> 


466 


Y10159 


D.discoideum 
racGAP gene 


0.54 


<NONE> 


<NONE> 


<NONE> 


467 


AB001895 


Homo sapiens mRNA 
for B120, complete
cds


0.54


<NONE>


<NONE> 


<NONE> 


468 


X12357 


Bovine gene for
aspartyl protease
NM1 exons 3 and 4 >
:: lcl|X12357 Bovine
aspartyl protease 
NM1 gene, exons 3 
and 4. 


0.54 


<NONE> 


<NONE> 


<NONE> 


469 


AE001151 


Borrelia burgdorferi 
(section 37 of 70) of 
the complete genome 


0.54 


<NONE> 


<NONE> 


<NONE> 


470 


X92052 


H. sapiens mRNA for 
T cell receptor alpha 
chain 


0.54 


<NONE> 


<NONE> 


<NONE> 


471 


U00938 


Mus musculus ileal 
lipid-binding protein 
gene, complete cds 


0.54 


1009712 


(U27698) calreticulin 
[Arabidopsis thaliana] 


4.9 


472 


X68367 


M.thermoformicicum 
complete plasmid
pFZ1 DNA


0.54 


125272 


CASEIN KINASE II, ALPHA
CHAIN (CK II)
>gi|419938|pir||A43297 casein
kinase II (EC 2.7.1.-) alpha
chain - Theileria parva
>gi|161871 (M92084) casein
kinase II alpha subunit 
[Theileria parva] 


4.7 


473 


Z61098 


H.sapiens CpG DNA, 
clone 44c4, reverse 
read cpg44c4.rt1a.


0.54 


4191274 


(AJ131094) Xvent-IB protein 
[Xenopus laevis] 


3.7 


474 


M63962 


Human gastric H,K- 
ATPase catalytic 
subunit gene, 
complete cds. 


0.54 


3881648 


(Z70757) similar to serine 
protease inhibitor 
[Caenorhabditis elegans] 


3.7 


475 


X86019


H.sapiens mRNA for
PRPL-2 protein


0.54


1648828


(D87963) ETF-related factor-1
(ETFR-1)


2.1 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


476


X89010


S.glaucescens genes
strU, strX, strV and
strW for 5-
hydroxystreptomycin
production and
transport
polypeptides


0.54 


3550345 


(AF084524) cellular repressor 
of E1A-stimulated genes CREG
[Mus musculus] 


0.25 


477 


AB007836 


Homo sapiens mRNA 
for Hic-5, partial cds 


0.54 


1097213 


ORF 1 [Streptomyces 
lavendulae] 


0.15 


478 


U32622 


Comamonas
testosteroni TsaR
(tsaR),
toluenesulfonate
methyl-
monooxygenase
oxygenase component
component (tsaB),
toluenesulfonate zinc-
independent alcohol
dehydrogenase...


0.54 


3875351 


(Z96047) DY3.6 
[Caenorhabditis elegans]


0.006 


479 


D61394


Arabidopsis thaliana 
gene for beta-VPE, 
complete cds 


0.53 


<NONE> 


<NONE> 


<NONE> 


480 


D61394 


Arabidopsis thaliana 
gene for beta-VPE, 
complete cds 


0.53 


<NONE> 


<NONE> 


<NONE> 


481 


Z33072 


M.capricolum DNA 
for CONTIG MC097 


0.53 


<NONE> 


<NONE> 


<NONE> 


482 


U45975 


Human
phosphatidylinositol
(4,5)bisphosphate 5-
phosphatase homolog
mRNA, partial cds.


0.53 


<NONE> 


<NONE> 


<NONE> 


483 


Z71324 


S.cerevisiae 
chromosome XIV 
reading frame ORF 
YNL048w 


0.53 


2135586 


M130 antigen (cytosolic variant
1) - human 


2.1 


484 


L32090


Listeria
monocytogenes secA
gene, complete cds.


0.53


2291129


(AF016415) No definition line
found [Caenorhabditis elegans] 


0.70 






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


485 


D86423 


Mus musculus mRNA
for HGT keratin, 
partial cds 


0.53 


1235974 


(X96713) collagen [Globodera 
pallida] 


0.41


486 


Y15969


kappa 21-6 gene, 
partial 


0.52 


<NONE> 


<NONE> 


<NONE> 


487 


M27480 


Mus musculus (clone 
3F9) transcribed 
germline T cell 
receptor gamma chain
(Tcr-g) mRNA, VJ4
C4 region.


0.52


3875542


(Z67990) Similarity to Rat
amiloride-sensitive sodium
channel beta-subunit


4.6


488 


D87004 


Human (lambda)
DNA for
immunogloblin light
chain


0.52


1766073


(U37272) winged helix protein
CWH-1 [Gallus gallus]


3.5


489 


Z99704 


Human DNA 
sequence from 
cosmid E75B8 on 
chromosome 22,
complete sequence
[Homo sapiens]


0.51


<NONE>


<NONE>


<NONE>


490 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds


0.51


<NONE> 


<NONE> 


<NONE> 


491 


U32795 


Haemophilus 
influenzae Rd section 
110 of 163 of the 
complete genome 


0.50 


<NONE> 


<NONE> 


<NONE> 


492 


M14602


Human myoglobin
gene, exon 2.


0.49


478384


helicase homolog g10L protein -
African swine fever virus
>gi|414091 (X72951) G10L 125
KDa protein


7.0


493 


D87075


Human mRNA for
KIAA0238 gene,
partial cds


0.24


1938429


(U97002) similar to
Schizosaccharomyces pombe 4-
nitrophenylphosphatase
(PNPPASE) (SP:Q00472,
PID:g5004) [Caenorhabditis
elegans]




494 


U95102


Xenopus laevis
mitotic
phosphoprotein 90
mRNA, complete cds


0.23


<NONE>


<NONE>


<NONE>
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


495


J05254


N.crassa
mitochondrial small
tRNA.


0.23 


192150 


(L05670) clustrin [Mus 
musculus] 


5.1 


496 


X16399


Gene for glutamate 
dehydrogenase (EC 
1.4.1.4), put. bacteria 
on si n 


0.23


790933 


(L07867) invariant surface 
glycoprotein [Trypanosoma 
brucei] 


0.030 


497 


AE001251


Treponema pallidum
section 67 of 87 of
the complete genome




<NONE>


<NONE> 


<NONE> 


498 


AF026919 


Homo sapiens
amyloid lambda light 
chain variable region 
mRNA, partial cds 


0.21 


<NONE> 


<NONE>


<NONE> 


499 


Z27247 


D.melanogaster
mRNA for defensin


0.21 


<NONE> 


<NONE> 


<NONE> 


500 


Y15608 


Candida albicans 
UBI3 gene


0.21 


<NONE> 


<NONE> 


<NONE> 


501


V00598


Human beta-tubulin


0.21


<NONE> 


<NONE> 


<NONE> 


502 


X79426 


A.thaliana 
microsatellite 
[repeated motif 
(gat)7] 


0.21 


<NONE> 


<NONE> 


<NONE> 


503


X75772 


A.caerulescens 
mitochondrial genes 
for cytochrome b and 
NADH 

dehydrogenase 5 


0.21 


139626 


PROTEIN T1 PRECURSOR


7.8


504 


AF028736


Serratia marcescens
site specific
recombinase


0.21


3645960


evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=47.46; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SWISS-
PROT:P23792; 2-
match_description=DISCONNE
CTED PROTEIN.; 2-matc...


4.6 


505


X97545


S.cerevisiae OST5
gene


0.21


2275631


(AF014940) No definition line
found [Caenorhabditis elegans]


2.7 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















506 


M24543 


Human prostate- 
specific antigen (PA) 
gene, complete cds. 


0.21 


1938527 


(U97012) C04E6.2 gene 
product [Caenorhabditis
elegans]


2.7 


507 


M62470 


Mouse 

thrombospondin 
(THBS1) gene, 
complete cds. 


0.21 


548563 


RNA REPLICASE
POLYPROTEIN 2.7.7.48) -
Erysimum latent virus
>gi|3892232 (AF098523)
replicase protein [Erysimum
latent virus]


2.1 


508 


Y13544


Homo sapiens cosmid 
CI 


0.21 


1235710 


(L40584) polyprotein 
[Infectious pancreatic necrosis 
virus] 


2.0 


509 


M24193 


Chicken MHC B 
complex protein (CI 2 
3) mRNA, complete 
cds. 


0.21 


3600102 


(AF090441) extracellular reelin 
[Gallus gallus] 


0.52 


510 


X97161


H. sapiens TFE3 gene, 
exon 4,5 & 6 


0.21 


854065 


(X83413) U88 [Human 
herpesvirus 6] 


0.30 


511 


X67649 


R.norvegicus DNA 
sequence for 
LFB1/HNF1
promoter 


0.21 


3913114 


TRANSCRIPTION FACTOR 
COUP 2 COUP-TFII - chicken 
>gi|392817 (U00697) orphan 
receptor COUP-TFII [Gallus 
gallus] 


0.004 


512 


U63807 


Fugu rubripes growth 
hormone (GH) gene, 
complete cds 


0.21 


3510505 


(AF030881) pol polyprotein
[Fugu rubripes]


3e-04 


513 


Z95636 


H.sapiens mRNA for 
laminin alpha 5 chain 


0.21 


400350 


NAM7 PROTEIN (NONSENSE-
MEDIATED MRNA DECAY
PROTEIN 1) (UP-
FRAMESHIFT SUPPRESSOR
1) factor NAM7 - yeast
(Saccharomyces cerevisiae)
>gi|4023


1e-07


514 


U91907 


Mirounga leonina 
major 

histocompatibility 
complex class II 
(DQA) gene, partial 
cds 


0.20 


<NONE> 


<NONE> 


<NONE> 


515 


Z35758 


Transmissible
gastroenteritis virus 
TFI virion protein 
genes 


0.20 


<NONE> 


<NONE> 


<NONE> 


516 


X00334 


Drosophila virilis 
simple DNA 
sequence (pDv-19) 


0.20 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE
















517 


M76741 


Homo sapiens biliary 
glycoprotein (BGP) 
gene, partial cds. 


0.20 


<NONE> 


<NONE> 


<NONE> 


518 


D73515 


Mus musculus rae28 
gene, exon 1 and 
5' flanking region


0.20 


<NONE> 


<NONE> 


<NONE> 


519 


M62975 


Drosophila 
melanogaster RNA 
polymerase II second 
largest subunit 
upstream (DmRP 
140) gene, exons 1-4. 


0.20 


<NONE> 


<NONE> 


<NONE> 


520 


M27260 


Chicken 78-kD 
glucose-regulated 
protein, complete cds. 


0.20 


<NONE> 


<NONE> 


<NONE> 


521 


AF076470 


Rice tungro 
bacilliform virus 
Serdang strain, 
complete genome 


0.20 


<NONE> 


<NONE> 


<NONE> 


522 


AF076470 


Rice tungro 
bacilliform virus 
Serdang strain, 
complete genome 


0.20 


<NONE> 


<NONE> 


<NONE> 


523 


U04636 


Human 

cyclooxygenase-2 
(hCox-2) gene, 
complete cds.


0.20 


<NONE> 


<NONE> 


<NONE> 


524 


AE001430 


Plasmodium 

falciparum

chromosome 2, 
section 67 of 73 of 
the complete 
sequence 


0.20 


<NONE> 


<NONE> 


<NONE> 


525 


AF043514


Mus musculus
phosphomannomutase
(Pmm2) mRNA,
complete cds


0.20


3025006


HYPOTHETICAL KD
PROTEIN IN MOAE-RHLE
INTERGENIC REGION
>gi|1787009 (AE000181) orf,
hypothetical protein
[Escherichia coli]


9.8
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE
















526 


U23144 


Xenopus laevis FTZ-
F1-related nuclear
orphan receptor
variant (xFF1rAshort)
mRNA, complete cds


0.20


3184402


(AB014477) period protein
[Chymomyza costata]


9.6 


527 


U14621 


Paracentrotus lividus 
Pax-6 (suPax-6) 
mRNA, complete cds. 


0.20 


465894 


PROBABLE MICROSOMAL
SIGNAL PEPTIDASE 23 KD 
SUBUNIT (SPC22/23) 
>gi|630688|pir||S44854 
K12H4.4 protein - 
Caenorhabditis elegans 
>gi|289708 (L14331) homology 
with signal peptidase; coded for 
by C. elegans cDNAs GenBank: 
M79661, M79662 and M79663; 
putative 


7.7 


528 


AF030511 


Actinobacillus 
pleuropneumoniae 
MRP ATPase 
homolog (mrp) gene, 
partial cds; ApxIVA 
var3 (apxIVA) gene, 
complete cds; and
beta-galactosidase 
(lacZ) gene, partial 
cds 


0.20 


1175966 


HYPOTHETICAL 45.3 KD 
PROTEIN IN THI5 5'REGION 
>gi|1084720|pir||S56193
probable membrane protein 
YFL062w - yeast 
(Saccharomyces cerevisiae) 


7.2 


529 


AF070581 


Homo sapiens clone 
24540 mRNA 
sequence 


0.20 


542394 


glyoxal oxidase (EC 1.2.3.-) 
precursor - basidiomycete 
(Phanerochaete chrysosporium) 
>gi|1050302


5.8 


530 


X75437 


T.maritima pgK gene 
for 3- 

phosphoglycerate
kinase


0.20


825648


(Z34531) coproporphyrinogen
oxidase [Homo sapiens]


5.8 


531 


U32686


Haemophilus
influenzae Rd section
1 of 163 of the
complete genome


0.20


3309593


(AF072878) ciliary outer arm
dynein beta heavy chain


5.6 


532 


Z28081


S.cerevisiae
chromosome XI
reading frame ORF
YKL081w


0.20


2507201


carbon catabolite
derepressing protein
KINASE >gi|1469803 (L78129)
serine/threonine kinase [Candida
albicans]


5.5 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


533


AF022725


Hordeum vulgare
limit dextrinase
(HvLD99) gene,
complete cds


0.20 


3139154 


(AF064077) adrenocorticotropic
hormone receptor [Sus scrofa] 


4.3 


534


AL021726


Drosophila
melanogaster cosmid
171E4


0.20


3885334


(AC005623) putative argonaute
protein [Arabidopsis thaliana]


2.6


535


AB0ni06


Brassica rapa mRNA
for SRK45, complete
cds


0.20


4008334


(Z92824) B0413.4
[Caenorhabditis elegans]


1.5


536


Z46606


H.sapiens HLTF gene
for helicase-like
transcription factor


0.20


132946


60S RIBOSOMAL PROTEIN
L30B (RP29) cytosolic - yeast
(Saccharomyces cerevisiae)
>gi|171821 not determined)
[Saccharomyces cerevisiae]
>gi|1045254 cerevisiae]
>gi|1323250|gnl|PID|e243708
(Z72933) ORF YGR148c
[Saccharomyces cerevisiae]


1.5 


537


X87193


H.sapiens mRNA for
2.19 gene


0.20


139820


DNA-REPAIR PROTEIN
XRCC1


1.5


538


L77965 


Clostridium 
perfringens C beta 2 
toxin gene, complete 
cds 


0.20


1175950 


HVH01RL11CAL3TTKD 

PROTEIN IN SEC53-ACT1 
INTERGENIC REGION 
>gi|1084703|pir||S5621i 
probable membrane protein 
YFL044c - yeast 
(Saccharomyces cerevisiae) 
>gi|836711|gnl|PID|d1009835
(D50617) YFL044C


1.4


539


M15938


Chicken neural cell-
adhesion molecule (N-
CAM) gene, exon 18.


0.20


2133082


regulatory protein MSR1 - yeast


1.1


540


AJ003220


Solanum tuberosum
mRNA for extensin-
like protein, partial


0.20


2496932


HYPOTHETICAL ^.9 KD
PROTEIN C56G2.1 IN
CHROMOSOME III
>gi|726413 (U23177) C56G2.1
gene product [Caenorhabditis
elegans]


1.1


541


X98108


A.thaliana psbP gene


0.20


119227


EPIDERMAL GROWTH
FACTOR PRECURSOR
precursor - mouse >gi|309210
(J00380) prepro-egf [Mus
musculus]


0.49
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



542 | AB011179



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION 



Homo sapiens mRNA 
for KIAA0607 
protein, partial cds 



DESCRIPTION 



0.20 



2143753 



gene VGF protein - rat 
>gi|205690 (M60525) nerve 
growth factor inducible protein 
[Rattus norvegicus] >gi|205701 
(M60522) nerve growth factor- 
inducible protein [Rattus 
norvegicus] >gi|207651



P VALUE



543 | X75318



H.sapiens ITIH1 gene 
(exon 22) and ITIH3 
gene 



0.20 



544 | AB008374



Oncorhynchus mykiss
mRNA for alpha 3
type I collagen,
partial cds



629557 



U09809 



Limulus polyphemus 
arginine kinase 
mRNA, complete cds 



546 | AB020671



Homo sapiens mRNA
for KIAA0864
protein, partial cds



547 | L04457



548 | L04457



Phytophthora
megasperma
mitochondrial
ORF152, complete
cds, cytochrome c 
oxidase subunit I 
(coxl) gene, 
complete cds, 
cytochrome c oxidase 
subunit II 



0.20 



1082610 



0.20 



3882016 



RNA-binding protein rnpD - 
Arabidopsis thaliana (fragment)
>gi|510240 (X61108) RNA
binding protein [Arabidopsis
thaliana]



rnufl protein - human 
>gi|762953 (X86018) rnufl 
[Homo sapiens] 



(AJ012650) CP [Papaya
ringspot virus] 



0.20 



2674350 



Phytophthora
megasperma 
mitochondrial 
ORF152, complete 
cds, cytochrome c 
oxidase subunit I 
(coxl) gene, 
complete cds, 
cytochrome c oxidase 
subunit II 



0.20 



746516 



(U93121) M-phase
phosphoprotein-1 [Homo 
sapiens] 



(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]



0.20 



746516 



(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]



0.38 



0.37 



0.37 



0.18



0.04: 



0.042 
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SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



549 



S82819 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



DESCRIPTION I P VALUE I ACCESSION 



Cdk5r=cyclin-dependent kinase 5
regulatory subunit 
p35 [mice, brain, 
129/SvJ, C57BL/6, 
Genomic/mRNA, 
5523 nt] 



550 



D31792 



Streptomyces griseus 
DNA for 
serine/threonine 
protein kinases, 
complete cds 



551 



U97499 



552 | U31463



553 



X78401 



Homo sapiens 
butyrophilin (BT3.2) 
gene, exons 5-10, and 
complete cds 



Rattus norvegicus 
nonmuscle myosin 
heavy chain-A 
mRNA, complete cds



0.20 



3413870 



0.20 



861405 



DESCRIPTION 



P VALUE



(AB007923) KIAA0454 protein
[Homo sapiens]



0.20 



2773341 



Bacteriophage P22 
right operon, orf 48, 
replication genes 18 
and 12, nin region 
genes, ninG 
phosphatase, late 
control gene 23, orf 
60, complete cds, late 
control region, start 
of lysis gene 13 



0.20 



3880111 



554 | X57310



Nocardia 

lactamdurans pcbAB 
and pcbC genes for 
alpha-aminoadipyl-L- 
cysteinyl-D-valine
synthetase and 
isopenicillin N 
synthase 



0.20 



1123087 



0.20 



1723511 



(U29154) T07F12.2 gene 
product [Caenorhabditis 
elegans] 



(AF040954) putative protein 
phosphatase 1 nuclear targeting 
subunit [Rattus norvegicus] 



0.020 



0.019 



0.008 



(Z81130) predicted using
Genefinder 



(U42436) C49H3.3 gene 
product [Caenorhabditis 
elegans] 



PUTATIVE ENDONUCLEASE
C1F12.06C yeast
(Schizosaccharomyces pombe)
>gi|1217980 (Z69944) unknown
[Schizosaccharomyces pombe]



0.002 



4e-04 



555 



X62386 



S.epidermidis genes 
epiY', epiY, epiA, 
epiB, epiC, epiD,
epiQ, epiP



0.20 



3874927 



(Z73424) C44B9.1 
[Caenorhabditis elegans] 



3e-10 






Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION



561 



556 | X59000



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



Epizootic 

haemorrhagic disease 
virus gene segment 6 
for NS1 



557 | M98776



558 | AF011446



Human keratin 1 
gene, complete cds 



Mus musculus 
granzyme K gene 
complete cds 



DESCRIPTION 



0.20 



3879755 



0.20 



1086900 



559 I AF074708 



560 | X13287 



Z49509 



562 I D89041 



Macaca mulatta clone 
MMU1.5 FRGMike 
[pseudogene, exons 7 
and 8, partial 
sequence 



0.19 



<NONE> 



Medicago sativa 
nodulin-25 gene 



S.cerevisiae 
chromosome X 
reading frame ORF 

YJR009c 

Bovine DNA for 



prostaglandin 
F2alpha receptor, 
partial cds 



563 | D29644 



564 | AE001461



565 



L38559 



566 1 Z8262S 



Streptococcus 
salivarius DNA for 
dextranase 



Helicobacter pylori, 
strain J99 section 22 
of 132 of the 
complete genome
Homo sapiens 
galactocerebrosidase 
(GALC) gene, exon 
17. 



R.prowazekii 
genomic DNA 
fragment (clone 
A405F) 



0.19 



<NONE> 



0.19 



<NONE> 



0.19 



<NONE> 



(Zduzzu) similar to nucleotide
binding protein; cDNA EST
EMBL:M75897 comes from this 
gene; cDNA EST 
EMBL:M89054 comes from this 
gene; cDNA EST 
EMBL:D26713 comes from this 
gene; cDNA EST 
EMBL:D26718 comes from this
gene; cDNA...



P VALUE



(U41278) contains similarity to 
G beta repeats 



0.19 



<NONE> 



0.19 



<NONE> 



0.19 



0.19 



<NONE> 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



8e-16 



2e-30 



<NONE> 



<NONE> 



<NONE> | 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


567 


U25641 


Tetrahymena 
thermophila 
telomerase 
component p80 
mRNA, complete cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


568 


AB002343 


Human mRNA for 
KIAA0345 gene, 
complete cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


569 


D10064


Erwinia carotovora 
gene for pectate lyase 
III, complete cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


570 


U31734 


Homo sapiens clone
MF1I8 A4A10
hypoxanthine
phosphoribosyltransferase
(hprt) 130 kb
deletion mutant
mRNA, partial cds,
contains human Alu
element


0.19 


<NONE> 


<NONE> 


<NONE> 


571 


AE001386


Plasmodium 
falciparum 
chromosome 2, 
section 23 of 73 of 
the complete 
sequence 


0.19 


<NONE> 


<NONE> 


<NONE> 


572 


M95623 


Homo sapiens 
hydroxymethylbilane 
synthase gene, 
complete cds. 


0.19 


<NONE> 


<NONE> 


<NONE> 


573 


S67478


(GC*1S)=vitamin D-
binding protein/group
specific component
[human, peripheral
blood leukocytes,
Genomic, 794 nt,
segment 4 of 9]


0.19 


<NONE> 


<NONE> 


<NONE> 


574 


X99075


H.sapiens NRGN
gene, exon 1


0.19 


<NONE> 


<NONE> 


<NONE> 


575 


AF044775


Homo sapiens
breakpoint cluster
region BCRderl4
sequence


0.19 


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION


P VALUE 






576


AB002333


Human mRNA for
KIAA0335 gene,
complete cds


0.19 


<NONE> 


<NONE> 


<NONE> 


577 


U53566 


Macaca mulatta
pit-1/GHF-1
transcription factor
mRNA, complete cds


0.19 


1078068 


probable membrane protein 
YLR311c - yeast


9.2 


578 


U73664 


Human 

t(11;14)(q13;q32)
breakpoint junction 
sequence 


0.19 


116734 


COAT PROTEIN (CAPSID
PROTEIN) virus >gi|58901 
(X62133) CyMV coat protein 
gene product 


8.8 


579 


AF004054 


Heterophyllaea 
pustulata rps16 gene,
chloroplast gene, 
partial intron 
sequence 


0.19 


1928991 


(U92815) heat shock protein 70 
precursor [Citrullus lanatus]


8.7 


580 


Z27081


Caenorhabditis
elegans cosmid
M01A8, complete
sequence
[Caenorhabditis
elegans]


0.19 


2496247 


HYPOTHETICAL ATP-BINDING
PROTEIN MJ0625
>gi|2128413|pir||A64378
hypothetical protein MJ0625 -
Methanococcus jannaschii
>gi|1591336 (U67510) M.
jannaschii predicted coding
region MJ0625


8.6 


581 


Z74145 


S.cerevisiae 
chromosome IV 
reading frame ORF 
YDL097c 


0.19 


1174425 


TYROSINE-PROTEIN
KINASE SPK-1 


6.7 


582 


D38547 


Small round 
structured virus 
genomic RNA, 
3' terminal sequence
containing ORF2 and
ORF3


0.19 


971318 


(Z48053) putative protein
[Bovine herpesvirus 1] 


5.1 



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Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















583 


D88000 


DNA for 16S
ribosomal RNA > ::
dbj|D88002|D88002
Ralstonia eutropha 
DNA for 16S 
ribosomal RNA > :: 
dbj|D88003|D88003 
Ralstonia eutropha 
DNA for 16S 
ribosomal RNA > 
dbj|D88004|D88004 
Ralstonia eutropha 
DNA for 16S 
ribosomal RNA 


0.19 


3800952 


(AF100657) No definition line 
found [Caenorhabditis elegans]


5.1 


584 


U67462 


Methanococcus 
jannaschii section 4 
of 150 of the 
complete genome 


0.19 


3183617 


(AJ005586) MYB-related 
transcription factor 
[Antirrhinum majus] 


4.0 


585 


L23906 


Gallus domesticus 
microsatellite DNA 
marker. 


0.19 


1947094 


(U93074) voltage-gated sodium 
channel homolog BdNal 


3.9 


586 


AE001462 


Helicobacter pylori, 
strain J99 section 23 
of 132 of the
complete genome 


0.19 


1730177 


GLUCOSE-6-PHOSPHATE 
ISOMERASE (GPI) 
ISOMERASE) (PHI) 
>gi|2118333|pir||I48073 glucose
phosphate isomerase - Chinese
hamster >gi|987046 griseus]


3.9 


587 


M19460


P.putida catBC
operon encoding
cis,cis-muconate
lactonizing enzyme I
and muconolactone
isomerase genes,
complete cds.


0.19 


3873843


(jDUlbb) cDNA EST
yk251g7.3 comes from this
gene; cDNA EST yk251g7.5
comes from this gene; cDNA
EST EMBL:D68223 comes
from this gene; cDNA EST
EMBL:C12737 comes from this
gene; cDNA EST yk389cS.5
comes from this gene; cDNA


3.9 


588 


U22349


Tetrahymena australis
telomerase RNA
gene, complete
sequence


0.19 


4105782


(AF049922) PGP169-12
[Petunia x hybrida]


3.2 




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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


589 | L27745


Homo sapiens voltage-
operated calcium
channel, alpha-1
subunit mRNA,
complete cds.


0.19 | 3763926


(AC004450) unknown protein
[Arabidopsis thaliana]


3.0


590 1 AF049588 


Canis familiaris 
synapsin I gene, 
partial cds


0.19 | 4104931


(AF042196) auxin response
factor 8 [Arabidopsis thaliana]


3.0


591 1 X06627 


Staphylococcus 
aureus plasmid pS194 
sequence 


0.19 | 137927


PRE-NECK APPENDAGE
PROTEIN (LATE PROTEIN
GP12) >gi|75856|pir||WMBP22
gene 12 protein - phage phi-29
>gi|215330 (M14782) pre-neck
appendage protein
[Bacteriophage phi-29]
>gi|225367|prf||1301270G gene
12 [Bacteriophage phi-29]


2.3


592 1 X61597 


M.musculus gene for
kallikrein-binding
protein


0.19 | 2982874


(AE000675) cobalamin 
synthesis related protein CobW 


1.7 1 


593 I AF016242 


Dictyostelium 
discoideum protein 
synthesis elongation 
factor 1-alpha (tef2)
gene, partial cds 


0.19 1 133659 


PUTATIVE RNA-DIRECTED 
RNA POLYMERASE 


14 


594 | AF004447


Venezuelan equine
encephalitis virus
strain 1327
polyprotein gene,
partial cds > ::
gb|AF004460|AF004460
Venezuelan
equine encephalitis
virus strain 1335
polyprotein gene,
partial cds


0.19 | 4096173


(U25968) early embryogenesis
protein [Oryza sativa]


i i 1 


I h 

1 o 
1 c 

P 

595 1 J04821 6 


luman elastin (ELN) 
ene, exon 1, clones 
ELC-5 and HELC- 


0.19 | 1170523


INHIBIN BETA B CHAIN
PRECURSOR inhibin precursor
- bovine >gi|563753 (U16241)
betaB inhibin/activin precursor
[Bos taurus]


1.3 




596 | AF059650


Homo sapiens histone
deacetylase 3
(HDAC3) gene,
complete cds


0.19 | 3024881


PROBABLE TRANSPORT
PROTEIN CY21C12.11
>gi|2078066|gnl|PID|e315171
(Z95210) betP


0.83 
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SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



597 



M69053 



598 | AF076279 



DESCRIPTION 



D.melanogaster 
calcium-activated K+ 
channel subunit



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION 



DESCRIPTION 



Dictyostelium 
firmibasis plasmid 
Dfpl, complete 
plasmid sequence



0.19 



1707984 



FERREDOXIN-DEPENDENT
GLUTAMATE SYNTHASE
(FD-GOGAT)
>gi|2126524|pir||S60228
glutamate synthase (ferredoxin)
(EC 1.4.7.1) gltB -
Synechocystis sp. (PCC 6803)
>gi|515938 (X80485) glutamate
synthase



P VALUE



0.19 



453986 



(U00008) yejA [Escherichia 
coli] 



0.80 



599 I D28873 



600 



U06071 



601 



L54057 



602 



X89806 



603 | AE001104



Mouse MCNP gene
for C-type natriuretic
peptide, complete cds
(exon1, exon2)



Oxytricha nova
macronuclear actin II
gene, complete cds.



Homo sapiens CLP 
mRNA, partial cds. 



P.lividus cDNA for
COLL2alpha gene 



Archaeoglobus 
fulgidus section 3 of 
172 of the complete 
genome 



604 | U54501 



605 



X74468 



606 



U20285 



Rattus norvegicus 
microsatellite 
sequence D0Mco22 



Human 

papillomavirus type 
15 genomic DNA 



0.19 



2650444 



(AE001092) acetyl-CoA 
synthetase (acs-1) 
[Archaeoglobus fulgidus] 



0.63 



0.19 



1584024 



complement control protein
[Botryllus schlosseri]



0.19 



3036883 



(AL022374) putative ABC 
transporter 



0.48 



0.19 



3638957 



(AC004877) sco-spondin-mucin 
like; similar to P98167 uncertain 
[Homo sapiens]



0.46 



0.41 



0.19 



2315192 



(Y11739) transcription factor
[Homo sapiens]

D-MeAsp ~~ 



0.35 



0.19 



228951 



receptor:ISOTYPE=epsilon3
[Mus musculus]



0.32 



Human Gpsl (GPS1) 
mRNA, complete cds



0.19 



3695390 



(AP096371) contains similarity 
to Rattus norvegicus cyclin G- 
associated kinase (SW:P97874) 
[Arabidopsis thaliana]



0.28



0.19 



2582659 



(AJ002527) glucitol-6-
phosphate dehydrogenase 
[Clostridium beijerinckii] 



607 I D49408 



Human gene for 
interleukin 3 receptor 
alpha subunit, exon 
10 



0.19 



252236S 



(AF008596) alpha 1,3- 
fucosyltransferase [Helicobacter 
pylori]



0.16 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


608 


AF041141


Homo sapiens
pituitary specific
homeodomain protein
(PROP1) gene, exon
3 and complete cds


0.19


37403 


(X03541) trk gene product (aa 1 


— 

0.09 1 1 


609 


L12531


Discopyge ommata 
Ca2+ channel alpha i 
subunit gene 
sequence. 


0.19 


3618274 


hypothetical protein


U.0o9 1 


610 


AF052445


Yellow fever virus 
clone HONG9 
polyprotein gene, 
complete cds 


0.19 


1932822 


(U15928) KH-domain putative
RNA binding protein


0.001


611 


Z36946 


B.anthracis sap gene 
encoding S-layer
protein 


0.19 


173241 


(L06487) ZIP1 protein 


2e-04


612 


AF087984 


Homo sapiens full 
length insert cDNA 
clone YW29A12 


0.19 


3786014 


(AC005499) hypothetical 
protein [Arabidopsis thaliana] 


1e-06


613 


AE001010 


Archaeoglobus 
fulgidus section 97 of 
172 of the complete 
genome 


0.19 


3135493 


(AF060248) unknown 
[Arabidopsis thaliana] 


7e-08


614


L08965


Trichosporon
cutaneum carbamoyl 
phosphate synthetase 
large subunit (argA) 
gene, partial cds. 


0.19 


1086901 


(U41278) F33GP 3 gene
product [Caenorhabditis
elegans]


2e-08


615


M91466


Rattus norvegicus
A2b-adenosine
receptor mRNA,
complete cds.


0.19 


2984320 


(AE000773) acetoin utilization
protein [Aquifex aeolicus]


6e-09


616


X95971


S.lividans groEL2
gene


0.19 


3925^77


(,AlX)J264j) similar to
Uncharacterized protein family
UPF0034, Double-stranded
RNA binding motif; cDNA EST
yk489b3.5 comes from this
gene; cDNA EST yk439g7.5
comes from this gene
[Caenorhabditis elegans]


7e-10


617


U12539


Schizosaccharomyces
pombe scd2 (scd2)
gene, complete cds.


0.19 


1938549


(U97016) similar to drosophila
Rlc1 gene product ribosomal
protein L4 (YML4)
(NID:g459259)


3e-14
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Nearest Neighbor (BlastN vs. Genbank) 



SEQ

ID | ACCESSION | DESCRIPTION



618 



U12539 



619 



Z68327 



Schizosaccharomyces
pombe scd2 (scd2)
gene, complete cds. 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



P VALUE | ACCESSION



DESCRIPTION 



(U97016) similar to drosophila 



0.19 



1938549 



Human DNA 
sequence from 
cosmid U25D11, 
between markers 
DXS366 and DXSS7 
on chromosome X. 



620 



U66525 



621 | U25830 



Dictyostelium 
discoideum 
ORFvegl 14 mRNA,
complete cds 



0.19 



3875774 



Rlc1 gene product ribosomal
protein L4 (YML4)
(NID:g459259)



P VALUE 



EMBL:D32434 comes from this 
gene; cDNA EST 
EMBL:D33710 comes from this
gene; cDNA EST 
EMBL:D34467 comes from this 
gene; cDNA EST 
EMBL:D35005 comes from this 
gene; cDNA EST 
EMBL:D37535 comes from this 
gene; ... 

>gi|3878710|gnl|PID|el348373 
EST EMBL:D33710 comes 
from this gene; cDNA EST 
EMBL:D34467 comes from this 
gene; cDNA EST 
EMBL:D35005 comes from this 
gene; cDNA EST 
EMBL:D37535 comes from this 
gene; ...



9e-15 



Newcastle disease 
virus isolate Herts/33 
matrix protein 
mRNA, complete cds



0.19 



3540281 



0.19 



2228750 



(AF056116) AIM related 
protein [Fugu rubripes] 



(U93868) RNA polymerase III 
subunit [Homo sapiens] 



6e-15 



2e-17 



622 | U89407



Mus musculus strain
BALB/c delta-
aminolevulinic acid
dehydratase (Lv)
mRNA, partial cds



0.19 



1825764 



(U88314) C46H11.11 gene 
product [Caenorhabditis 
elegans] 



623 | AF095598



Bison bison 
athabascae 
microsatellite BBJ 2 



0.18 



<NONE> 



<NONE> 



<NONE> 



624 | AF064260



Strongylocentrotus
purpuratus SRC8
mRNA, complete cds



0.18 



<NONE> 



<NONE> 



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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


625 


U69533 


Arabidopsis thaliana 
AtKAP alpha mRNA 
complete cds 


0.18 


J <NONE> 


<NONE> 


<NONE> 


626


D89041


Bovine DNA for
prostaglandin
F2alpha receptor,
partial cds


0.18 


I <NONE> 


<NONE> 


<NONE> 


627 1 


M24571 


Dictyostelium 
discoideum tRNA- 
Glu-GAA gene, clone
yGluGAA7. 


0.18 


I <NONE> 


<NONE> 


<NONE> 


628 I 


X59772 


D.melanogaster ovo 
gene required for 
female germ line 
development 


0.18 


1 <NONE> 


<NONE> 


<NONE> 


629 1 


AL010209


Plasmodium
falciparum DNA ***
SEQUENCING IN
PROGRESS ***
from contig 3-104, 
complete sequence 


0.18 


| <NONE> 


<NONE> 


<NONE> 


630 1 


U67575 


Methanococcus 
jannaschii section 117 
of 150 of the 
complete genome 


0.18


111839 


inositol 1,4,5-triphosphate 
receptor 2 - rat 


8.5 


631 1 


U28730 


Caenorhabditis 
elegans cosmid 
K10B2 


0.18


1787604 


(AE000232) orf, hypothetical 
protein [Escherichia coli] 


8.3 


632 1 


X99798 


L.lactis pepFl & 
pepF2 genes 


0.18


3406624


(AF079110) glycosomal malate
dehydrogenase [Trypanosoma
brucei]


8.1 


633 | AF025306


Danio rerio band 4.1-
like protein 4 (nbl4)
mRNA, complete cds


0.18


465445


PROBABLE NUCLEAR
ANTIGEN herpesvirus 1 (strain
Kaplan) >gi|334072 (M34651)
ORF-3 protein [Pseudorabies
virus]


7.9 


634 | AF059251


Mus musculus
lipoxygenase (alox)
mRNA, complete cds


0.18 


1655667


(Z81368) hypothetical protein
Rv2393


6.6 


635


Z22605


G.domesticus CTCF
protein mRNA,


0.18


481864


3-methyl-2-oxobutanoate
dehydrogenase


6.6 


636 | AB011086


Homo sapiens mRNA
for KIAA0514
protein, complete cds


0.18


3874158


(Z81464) predicted using
Genefinder


6.4



SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


637


Z78536


Caenorhabditis
elegans cosmid
C07A4, complete
sequence
[Caenorhabditis
elegans]


0.18


3702121


(AJ011681) retinoblastoma-
related protein [Chenopodium
rubrum]


6.4


638 


U67530


Methanococcus 
jannaschii section 72 
of 150 of the 
complete genome 


0.18


3877946


(8104)4) Weak similarity to 65
KDA heat shock protein
(TR:G602231); cDNA EST
EMBL:D71705 comes from this
gene; cDNA EST
EMBL:D74382 comes from this
gene [Caenorhabditis elegans]


6.3


639 


M63781 


Influenza 

A/Duck/England/1/62 
(H4N6) nucleoprotein 
mRNA, complete cds. 


0.18


3873663


(ZWW4) cDNA EST
EMBL:D71510 comes from this 
gene; cDNA EST 
EMBL:C08449 comes from this 
gene; cDNA EST yk266bl2.3 
comes from this gene; cDNA 
EST yk266bl2.5 comes from 
this gene; cDNA EST 
yk461h7.3 comes from this 
gene;cDNA... 


6.2


640


M73781


Oryctolagus 
cuniculus integrin 
beta-8 subunit 
mRNA, complete cds. 
> :: gb|I44828|I44828 
Sequence 3 from 
patent US 5635601


0.18


1362129 


major allergen OLE17 - 
common olive 


5.8 


641 J 


] 

X67219 


S ■ lilt Itll lU^tlblCI AXUIJ 

icne , ■ 


0.18 ? 


3449286 


(AB011527) MEGF1 [Rattus
norvegicus]


4.8


642


AF106941


Homo sapiens
beta-arrestin 2 mRNA,
complete cds


0.18 


548353 


[PROTEIN-PII]
URIDYLYLTRANSFERASE
vinelandii >gi|39257 (X59610)
uridylyl transferase


3.7


643


AF052602


Danio rerio
huntingtin (HD)
mRNA, complete cds


0.18


241058


potential IGF binding protein
[chickens, Peptide Partial, 77 aa,
segment 2 of 3]


3.6
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SEC 
ID 


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


644 


AB020709


Homo sapiens mRNA
for KIAA0902
protein, complete cds


0.18 | 3875570


(U,osji4) predicted using
Genefinder; cDNA EST
EMBL:M75775 comes from this
gene; cDNA EST
EMBL:M89255 comes from this
gene; cDNA EST
EMBL:M89127 comes from this
gene; cDNA EST
EMBL:T00141 comes from this
gene; cDNA EST EMBL:T...


2.1


645 


AF096883


HIV-1 isolate patient
3 country USA pol
polyprotein (pol)
gene, partial cds


0.18 | 3250696


(AL0244S6) putative protein


1.7


646 


L39928 


Pyrocoelia miyako
(clone pB-PmL41)
luciferase mRNA,
complete cds


0.18 | 2914702


(AC003974) unknown protein
[Arabidopsis thaliana]


0.73


647 


Mi 7082 


Human 

carcinoembryonic 
nonspecific 
crossreacting antigen 
(CEA; NCA) gene, 
exons 1 and 2. 


0.18 | 1351833


REGULATORY PROTEIN
ABAA


0.72


648 I 


X75318 


H.sapiens ITIH1 gene 
(exon 22) and ITIH3 
gene


0.18 | 629557


RNA-binding protein rnpD -
Arabidopsis thaliana (fragment)
>gi|510240 (X61108) RNA
binding protein [Arabidopsis
thaliana]


0.41 


649


AF011908


Mus musculus
apoptosis associated
tyrosine kinase
(AATYK) mRNA,
complete cds


< 

0.18 | 330442


(K03332) nuclear antigen 2
[Epstein-Barr virus]


5e-04


650


U04004


Simian
immunodeficiency
virus SIVagmVER-2
envelope protein
gene, partial cds.


0.18 | 135102


ASPARTYL-TRNA
SYNTHETASE aspartate--tRNA
ligase (EC 6.1.1.12) -
Escherichia coli coli]
>gi|1736513|gnl|PID|d1016401
(D90829) Aspartate-tRNA
ligase (EC 6.1.1.12)
[Escherichia coli]


6e-11


651


U88155


Xenopus laevis
RanGTPase
activating protein

0.18 | 995714


C91258) pid:e198503
[Saccharomyces cerevisiae]


2e-13
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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


652 


1 ZL8921 


B.oleracea gene for S 
receptor kinase-like
protein 


0.18 


3875535


(z.t>to i 1 ) similar to noolcinase;
cDNA EST EMBL:D69553
comes from this gene; cDNA 
EST EMBL:D65938 comes 
from this gene; cDNA EST 
yk280h9.3 comes from this 
gene; cDNA EST yk280h9.5 
comes from this gene; cDNA 
EST yk223dll.3 come... 


1e-19


653 


M60650 


S.cerevisiae STA2 
gene, complete cds. 


0.16 


| <NONE> 


<NONE> 


<NONE> 


654 


U80912 


Eucalyptus globulus
NADP-isocitrate 
dehydrogenase 
(EglCDH) mRNA, 
complete cds 


0.16 


3766172


(AF057298) ornithine 
decarboxylase antizyme 2 [Mus 
musculus] 


4.2 


655 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.16 


76749 


hypothetical protein 4 - fowl 
adenovirus 1 


4.0 


656 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA. complete 
cds 


0.16 


3044086 


(AF055904) unknown 
[Myxococcus xanthus] 


0.60


657 


AF030231 


Glycine max sucrose 
synthase (SS) mRNA, 
complete cds 


0.078


<NONE> 


<NONE> 


<NONE> 


658 


M19183


Woodchuck hepatitis
virus (WHV),
complete genome,
clone WHV 59.


0.072


< 

1076190


cell wall glycoprotein, 75K,
precursor - diatom
(Cylindrotheca fusiformis)
>gi|515363 (X80394) P75K
gene product [Cylindrotheca
fusiformis]


6.3 


659 


U31557


Ovine adenovirus
IVa2 protein gene,
DNA polymerase
gene, terminal protein
gene and 52.55 kDa
protein gene, partial
cds


0.072


3511143


(AF061244) unknown
[Agrocybe aegerita]


6.2 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 

ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






660


AL021491


Caenorhabditis
elegans cosmid
Y44A6B, complete
sequence
[Caenorhabditis
elegans]


0.070 


<NONE> 


<NONE> 


<NONE> 


661 


M33874 


X.laevis Xotch 
protein mRNA, 
complete cds. 


0.070 


1654096 


(Y09076) RAD 3 
[Schizosaccharomyces pombe] 


0.23 


662 


AB012725


Mus musculus 
ZAN75 mRNA for 
zinc finger protein, 
complete cds 


0.069 


1350800 


MITOCHONDRIAL 
RIBOSOMAL PROTEIN S5 


2.0 


663 


AL021491 


Caenorhabditis 
elegans cosmid 
Y44A6B, complete 
sequence 
[Caenorhabditis 
elegans] 


0.068 


<NONE> 


<NONE> 


<NONE> 


664 


Z60318 


H. sapiens CpG DNA, 
clone lei, reverse 
read cpglel.rla . 


0.068 


1280134 


(U55376) F16H11.2 gene 
product [Caenorhabditis 
elegans] 


2.6 


665 


Z35973 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR104w


0.068 


2493000 


PROBABLE SUCCINYL-
COA:3-KETOACID-
COENZYME A

TRANSFERASE PRECURSOR
EMBL:Z14816 comes from this
gene; cDNA EST
EMBL:Z14946 comes from this
gene; cDNA EST 
EMBL:D69746 comes from this 
gene; cDNA EST yk219b6.3 
comes from this gene; cDNA 
ES... 


0.68


666 


Z86111 


Streptomyces lividans 
rpsP, trmD, rplS, 
sipW, sipX, sipY, 
sipZ, mutT genes and 
4 open reading 
frames 


0.068 


1235974 


(X96713) collagen [Globodera 
pallida] 


4e-04


667 


M72980 


Anthonomus grandis 
vitellogenin gene 
(VTG), complete cds. 


0.068 


3242750 


(AC005 164) match to ESTs 
AA731149 (NID:g2l4013S), 
AA73190S (NID:g2752719), 
AA287837 (NID:gl9335 19), 
AA26281 1 (NID:glS9S3S2), 
and AA825S20 (NID:g2S99132) 


1e-59
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Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION



668 



M34161 



669 



L03811



DESCRIPTION 



Rat tachykinin (PPT) 
gene, exons 5 and 6. 



P VALUE 



0.067 



Aspergillus niger zinc 
finger protein (creA) 
gene, complete cds 



0.067 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 
ACCESSION | DESCRIPTION | P VALUE



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



670 | M64983



671 | AF014051



Human fibrinogen
beta chain gene,
complete mRNA. >
gb|I47706|I47706 
Sequence 3 from 
patent US 5639940 



0.067 



Nicotiana tabacum 
Mg chelatase subunit 
(ChlH) mRNA
partial cds 



<NONE> 



<NONE> 



<NONE> 



0.067 



<NONE> 



<NONE> 



<NONE> 



672 



Y07540 



H.sapiens sil gene 



0.067 



glycoprotein GP330, renal - rat 
92331 (fragments) 



673 I AJ000347 



674 



Rattus norvegicus 
mRNA for 3'(2'),5'-
bisphosphate 
nucleotidase 



0.067 



129238 



25 KD OOKINETE SURFACE 
ANTIGEN PRECURSOR 
(PRS25) >gi|320962|pir||A44966
25k ookinete surface antigen
precursor - Plasmodium
reichenowi reichenowi]



7.5 



L19979 



675 1 X08050 



Squid sodium channel 
mRNA, complete cds 



0.067 



Yeast tRNA-Glu(3) 
gene and flanking 
regions 



2128473 



hypothetical protein MJ0750
Methanococcus jannaschii
>gi|1592304 (U67521)
ferredoxin-type protein



7.4 



1.5 



0.067 



1334398 



(X15081) MURF2 protein (AA 
11-348) 



0.65 



676 



X17115 



Human mRNA for 
IgM heavy chain 
complete sequence 



0.067 



1731331 



HYPOTHETICAL ii. 6 KD
PROTEIN CY49.14C
>gi|1370241|gnl|PID|e247089
(Z73966) hypothetical protein
Rv2075c [Mycobacterium
tuberculosis]



0.51 



677 I AF032871 



Homo sapiens 
uncoupling protein 3 
(UCP3) gene, exon 1 
and partial exon 2 



0.067 



112900 



ALPHA-2C-1 ADRENERGIC 
RECEPTOR human >gi|178194 
(J03853) kidney alpha-2- 
adrenergic receptor [Homo 
sapiens] >gi|1628638 (U72648)
alpha2-C4-adrenergic receptor
[Homo sapiens]



0.50 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












678


X05319


Mouse class II MHC
E-beta2 (d) gene
exon 3


0.067


585074


DYNAMIN 3 (DYNAMIN,
TESTICULAR) rat
>gi|391872|gnl|PID|d1003668
(D14076) testicular dynamin
[Rattus norvegicus]


3e-04 


679 


AB006362 


Candida albicans 
CaSLNl gene, 
complete cds 


0.067 


3417296 


(AC003007) Unknown gene 
product (partial) [Homo sapiens]


9e-56


680


AF021236


African horse 
sickness virus capsid 
VP3 (L3) mRNA, 
complete cds 


0.066 


<NONE> 


<NONE> 


<NONE> 


681 


AE001507 


Helicobacter pylori, 
strain J99 section 68 
of 132 of the 
complete genome 


0.066 


<NONE> 


<NONE> 


<NONE> 


682 


AF039717 


Caenorhabditis 
elegans cosmid 
R13H8 


0.066 


<NONE> 


<NONE> 


<NONE> 


683 


AF029027 


Syncerus caffer 
isolate Queen 
Elizabeth Mweya 14 
mitochondrial DNA 
control region 


0.066 


<NONE> 


<NONE> 


<NONE> 


684 


AF087967 


Homo sapiens full 
length insert cDNA 
clone YU51G05 


0.066 


2982476 


(X97203) CI protein [Beet curly 
top virus]


9.5 


685 


J02037 


Baboon endogenous 
virus proviral long 
terminal repeat DNA. 


0.066 


972767 


(L37868) POU-domain 
transcription factor [Homo 
sapiens] 


7.3 


686 


AF000141 


Lycopersicon 
esculentum class I 
knotted-like 
homeodomain protein 
(LeT6) mRNA, 
complete cds 


0.066 


3157926 


(AC002131) Strong similarity to
extensin-like protein gb|Z34465 
from Zea mays. [Arabidopsis 
thaliana] 


5.6 


687 


AB001746 


Bensingtonia sp. 
OK255 gene for 18S 
rRNA > :: 

dbj|AB001747|AB00 
1747 Bensingtonia 
sp. OK259 gene for 
18S rRNA 


0.066 


3859889 


(AF070064) cap 'n' collar
isoform C [Drosophila
melanogaster]


0.38
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






688


AE001461


Helicobacter pylori,
strain J99 section 22
of 132 of the
complete genome


0.065 


<NONE> 


<NONE> 


<NONE> 


689 


M30821 


Chicken erythroid 
transport proteins cl 
and c2 


0.065 


<NONE> 


<NONE> 


<NONE> 


690 


AB009802 


Homo sapiens gene 
for osteonidogen, 
intron 3 


0.065 


<NONE> 


<NONE> 


<NONE> 


691 


AF086062 


Homo sapiens full 
length insert cDNA 
clone YZ06B11


0.065 


<NONE> 


<NONE> 


<NONE> 


692 


AB002369 


Human mRNA for 
KIAA0371 gene, 
complete cds 


0.065


2500884 


SIGNAL SEQUENCE 
BINDING PROTEIN binding 
protein [Synechococcus sp.] 


5.5 


693 


AF086864 


Cyclopodia sp. large 
subunit ribosomal 
RNA gene, 
mitochondrial gene 
for mitochondrial 
RNAs, partial 
sequence > :: 
gb|AF086866|AF086 
866 Penicillidia sp. 
large subunit 
ribosomal RNA gene, 
mitochondrial gene 
for mitochondrial 
RNAs, partial 
sequence 


0.065 


3721684 


(AB012957) probable glycosyl 
transferase [Vibrio cholerae] 


5.5 


694 


L44593 


Bacteriophage BK5-T 
ORF410, 3' end of
cds, 20 ORFs, 
repressor protein, and 
Cro repressor protein 
genes, complete cds, 
ORF70* gene, 5' end 
of cds. 


0.065 


1172067 


PEPTIDASE T
(AMINOTRIPEPTIDASE)
influenzae Rd]


3.2 


695 


U80079 


Ciona intestinalis 
MyoD-family protein 
(CiMDFa) mRNA, 
complete cds 


0.065 


4218110 


(AL035353) contains EST 
gb:F152Sl 


2.5
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE



696 | AB020718



Homo sapiens mRNA
for KIAA0911
protein, complete cds



0.065



1722734



MINOR CAPSID PROTEIN L2
>gi|1020192 type 231



697 | AF082137 



699 



Zea mays copia-like
retrotransposon Stl- 
14 leader region, 
partial sequence 



698 | X64053 



R.norvegicus ZnBP 
gene for zinc binding 
protein 



U67065 



Mus musculus 
butyrophilin (BTN) 
gene, promoter region 
and complete cds 



0.065 



1877501 



0.065 



464963 



(U89278) polyhomeotic 2 
homolog [Homo sapiens] 



1.1 



TRYPSIN PRECURSOR



0.36 



0.065 



2132252 



hypothetical protein YPL263c -
yeast



700 | M64862 



701 | K02205 



Rat matrin F/G 
mRNA, complete cds. 



Yeast (S.cerevisiae) 
transcriptional 
activator of amino 
acid-biosynthetic 
genes (GCN4) gene, 
complete cds. 



0.065 



3420183 



(AF041105) organic anion
transporter protein 3 [Rattus 
norvegicus] 



4e-19 



0.064 



<NONE> 



<NONE> 



702 | X58282



Maize mRNA for a 
high mobility group 
protein 



0.064 



<NONE> 



<NONE> 



703 | AC001545

704 | AF023461



705 | U50307



Homo sapiens 
(subclone 1J3 from 
P1 H69) DNA
sequence 



0.064 



Homo sapiens 
FRA3B region 
sequence 



<NONE> 



0.064 



706 | U46542 



Caenorhabditis 
elegans cosmid 
F43H9. 



<NONE> 



0.064 



Streptococcus crista 
HmpA gene, partial 
cds, putative 
adhesin/ABC 
transport system 
protein (scbA) gene, 
complete cds 



<NONE> 



0.064 



1209391 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



(D83659) TPR protein pombe]
>gi|2894282|gnl|PID|e1251103
(AL021838) pre-mrna splicing
factor. [Schizosaccharomyces
pombe]



707 



X57564 



A.rusticana mRNA 
for neutral peroxidase 



0.064 



1492037 



(U60315) MC094R [Molluscum 
contagiosum virus subtype 1] 



6.9
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


708 


U06986


Human alpha-2-
macroglobulin
receptor/lipoprotein
receptor protein
(A2MR/LRP) gene,
exons 39-41.


0.064


100800


rab15B protein - wheat
>gi|21853 (X62476) rab protein
[Triticum aestivum]


5.3


709 


D85773


Human CpG island
sequence, clone
Q28B8


0.064


2245382


(U88325) suppressor of
cytokine signalling-1 [Mus
musculus]


5.3


710 


L06178


Apis mellifera 
ligustica complete 
mitochondrial 
genome 


0.064 


3695379


(AhUy03/Uj contains similarity
to a C. elegans hypothetical
protein F44G4.1 (GB:Z49910)
and several yeast hypothetical
proteins such as 35.1 KD
protein in NAM8-GAR1
intergenic region (SP:P38805)
[Arabidopsis thaliana]


3 1 1 


711 


Y16242


Triticum aestivum 
mRNA for beta- 
amylase 


0.064 


1175958


HYPOTHETICAL ;u.d KD
PROTEIN IN AGP3-DAK3
INTERGENIC REGION
>gi|1084712|pir||S56201
probable membrane protein
YFL054c - yeast
(Saccharomyces cerevisiae)
>gi|836701|gnl|PID|d1009825
(D50617) YFL054C


3.1


712


L81779


Homo sapiens
(subclone 2_a2 from
P1 H25) DNA
sequence


0.064


3845169


(AE001391) phosphatase (acid 
phosphatase family) 


n 9 1 i 

U.O l | 


713


X13826


C.reinhardtii psb1
mRNA for OEE1
protein of
photosystem II
(oxygen-evolving
enhancer protein)


0.064


171040


(M94535) ATPase
[Saccharomyces cerevisiae]
cerevisiae, Peptide, 377 aa]
[Saccharomyces cerevisiae]


0.054


714


X06487


H.sapiens mRNA for
)d2-Ig fusion gene


0.064


2429362


(AF020261) proline rich protein
[Santalum album]


0.016


715


U79638


Mus musculus cyclin-
dependent kinase
inhibitor protein
(p15(INK4b)) gene,
exon 2 and partial cds


0.064


3929221


(AF082557) TRF1-interacting
ankyrin-related ADP-ribose
polymerase [Homo sapiens]


1e-10
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


716 


U39099 


Human T cell 
receptor alpha chain 
mRNA, partial cds 


0.063


<NONE>


<NONE> 


<NONE> 


717 


U39673 


Clostridium 
acetobutylicum KdpC 
(kdpC) gene, partial 
cds, sensor histidine 
kinase homolog 
(kdpD) and response 
regulator homolog 
(kdpE) genes, 
complete cds 


0.063


<NONE>


<NONE>


<NONE>


718 


AL022317 


Human DNA 
sequence from clone 
140L1 on 

chromosome 22q13.1-
1331, complete 
sequence [Homo 
sapiens] 


0.063 


1931640 


(U95973) Serine 
carboxypeptidase isolog 
[Arabidopsis thaliana] 


5.2


719 


U28972 


Spiroplasma citri orfa 
and orff genes, partial 
cds, orfb, orfc, and 
orfe genes and 
Spiroplasma virus 
SpV1-derived ORF1
and ORF3 genes,
complete cds, and
SpV1-derived ORF14
gene, partial cds. 


0.063 


4091939 


(AF070704) envelope 
glycoprotein [Human 
immunodeficiency virus type 1] 


5.2


720 


U15159


Mus musculus limk
kinase (limk) mRNA,
complete cds


0.063


3638957


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens]


5.1


721 


AF058416


Homo sapiens
lipoprotein receptor-
related protein
(LRP1), exons 39, 40,
and 41


0.063


1788123


(AE000276) orf, hypothetical
protein [Escherichia coli]


4.0


722 


AE001430


Plasmodium
falciparum
chromosome 2,
section 67 of 73 of
the complete
sequence


0.063


2244849


(Z97337) hypothetical protein


4.0



WO 01/02568 



PCT/US00/18374



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


723


L29323


Streptococcus
pneumoniae methyl
transferase gene
cluster, complete
sequence


0.063 


3874022 


(Z70203) cDNA EST 
EMBL:D72339 comes from this 
gene; cDNA EST 
EMBL:D75197 comes from this 
gene [Caenorhabditis elegans]


2.3 


724 


X72631 


H. sapiens mRNA 
encoding Rev- 
ErbAalpha > ::
emb|X72632|HSREVERB2
H.sapiens
mRNA encoding Rev 
ErbAalpha (internal 
fragment) 


0.063 


3979878 


(UMOb) predicted using 
Genefinder; cDNA EST 
EMBL:T01277 comes from this 
gene; cDNA EST 
EMBL:T01796 comes from this 
gene; cDNA EST 
EMBL:D32545 comes from this 
gene; cDNA EST 
EMBL:D33060 comes from this 
gene; cDNA EST EMBL:D... 


1.7 


725 


U17969 


Human initiation 
factor eIF-5A gene, 
complete cds. 


0.063 


2429509 


(AF025467) contains similarity 
to drosophila DNA-binding 
protein K10 (NID:g8148) 
[Caenorhabditis elegans]


1.4 


726 


AE001000


Archaeoglobus 
fulgidus section 107 
of 172 of the 
complete genome 


0.063 


3462802 


(AF082486) nef protein [Human 
immunodeficiency virus type 1] 


0.35


727 


S80986 


svp[40]=svp-related 
nuclear 

receptor/retinoid 
signaling modulator 
[zebrafishes, mRNA, 
3876 nt] 


0.063 


1326288 


(U58734) weak similarity to 
ankyrin G [Caenorhabditis 
elegans] 


0.093


728


AF109134 


Homo sapiens 7-60 
mRNA, complete cds 


0.063 


1083764


proline-rich proteoglycan 2
precursor, parotid - rat
>gi|310200 (L17318) proline-
rich proteoglycan [Rattus
norvegicus] 


0.001 


729


D87466


Human mRNA for
KIAA0276 gene,

JiXl L 1 Ul \^ Uo 




287986:) | 


(AL021816) SPBC24E9.03c,
unknown, len:251aa
[Schizosaccharomyces pombe]


6e-05


730 


AB018269


Homo sapiens mRNA
for KIAA0726
protein, complete cds


0.063


2995865


(AF053455) tetraspan TM4SF
[Homo sapiens]


2e-16


731


D86954


Cricetulus griseus
mRNA for
Cytochrome P-450
2A14, complete cds


0.063


2496896


HYPOTHETICAL 47.6 KD
PROTEIN C16C10.5 IN
CHROMOSOME III
>gi|3874383|gnl|PID|e1344077
type (RING finger)
[Caenorhabditis elegans]


1e-22






Nearest Neighbor (BlastN vs. Genbank)



SEQ
ID



ACCESSION | DESCRIPTION
Plasmodium 



732 | AL010232



733 | U90714 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE 



falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
from contig 4-58, 
complete sequence 



ACCESSION 



DESCRIPTION 



Mycoplasma 
gallisepticum 
haemagglutinin 
precursor genes, 
complete cds 



0.062 



<NONE> 



<NONE> 



734 



AF107044

Homo sapiens clone
pCL4 DNA-binding
protein SOX21
(SOX21) gene,
complete cds

Caenorhabditis



0.062 



<NONE> 



<NONE> 



P VALUE



<NONE> 



0.062 



<NONE> 



<NONE> 



735 | L41729 



736 



Z99287 



elegans Ro
ribonucleoprotein
autoantigen mRNA,
complete cds | 0.062



Caenorhabditis
elegans cosmid
Y7A9D, complete
sequence
[Caenorhabditis
elegans] | 0.062



2983060 | (AE000687) putative protein
[Aquifex aeolicus]

1176542 | SERINE/THREONINE-
PROTEIN KINASE D1044.3
IN CHROMOSOME III
>gi|495684 (U00065) contains
EGF-like repeats; highly similar
to ZC84.1; 3' exons similar to
protein kinase [Caenorhabditis
elegans]



737 | AB014514



738 



L29165 



Homo sapiens mRNA
for KIAA0614
protein, partial cds | 0.062
Human germline
immunoglobulin light
chain variable region
(lambda-IIIb
subgroup) from IgM
rheumatoid factor. | 0.062



4033395 



739 | U09364



Schistosoma
japonicum Chinese
clone pY6
paramyosin mRNA,
partial cds.



1914685 



740 



Y16242



Triticum aestivum
mRNA for beta-
amylase



0.062 



1350800 



0.062 



79834 



DNA GYRASE SUBUNIT B
subunit [Myxococcus xanthus] 



(Y12014) RAD23 protein, 
isoform II 



<NONE> 



<NONE> 



8.6 



5.8 



3.9 



MITOCHONDRIAL 
RIBOSOMAL PROTEIN S5 



hypothetical protein 1246 (uvrA 
region) - Micrococcus luteus 
(fragment) 



1.3 



1.3 



0.59 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


741


M97695


Leishmania pifanoi
cysteine proteinase
(cys2) gene, complete
cds.


0.062


1174754


TROPOMYOSIN 1 (TMI)
(POLYPEPTIDE 49)
>gi|320989|pir||A60607
tropomyosin - fluke


0.018


742 


U67526 


Methanococcus 
jannaschii section 68 
of 150 of the 
complete genome 


0.062 


1330345 


coded for by C.
elegans cDNA yk34b1.5; coded
for by C. elegans cDNA
yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded
for by C. elegans cDNA
yk46d5.5; coded for by C.
elegans cDNA yk43c2.5; coded
for by C. elegans cDNA
yk46e8....


1e-40


743 


Z78414 


Caenorhabditis 
elegans cosmid 
W09D12, complete
sequence 
[Caenorhabditis 
elegans] 


0.061 


<NONE> 


<NONE> 


<NONE> 


744 


Y13606


Mus musculus gene
encoding filensin, 
exons 6, 7 


0.061 


2314715 


(AE000651) H. pylori predicted 
coding region HP1527


4.9 


745 


J04374 


Eggplant mosaic 
virus genome. 


0.061 


141449 


HYPOTHETICAL 35.5 KD
PROTEIN IN TRANSPOSON
TN4556 >gi|80759|pir||JQ0431
hypothetical 35.5K protein -
Streptomyces fradiae transposon 
Tn4556 


3.8 


746 


AB022200 


Marine obligately
oligotrophic
bacterium POO-10
DNA for 16S
ribosomal RNA,
partial sequence


0.061 


3983593 


(AB000307) transcarboxylase- 
beta 


2.2 


747 


X54250 


ival 1 1 IXxi^i r\ lui Lll\\, 

finger protein AT- 
BP2, partial cds 


0.061 


1377886 


(L46815) DNA binding protein
Rc [Mus musculus]


0.98


748 


X69942 


M.musculus mRNA 
of enhancer-trap- 
locus 1 


0.061 


2983969 


(AE000748) putative protein
[Aquifex aeolicus]


0.57 


749 


AJ223206


Mus musculus mRNA
for scrapie responsive
protein 1


0.061


4204265


(AC005223) 45643
[Arabidopsis thaliana]


5e-31 


750 


Y10205


H.sapiens mRNA for
CD88 protein


0.060 


<NONE> 


<NONE> 


<NONE> 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



751 



U79260 



752 | X07453



Human clone 23745 
mRNA, complete cds 



753 



U57502 



754 | X68359



755 



X51634 



Plasmodium 
falciparum 11-1 gene
part 1

Rattus norvegicus 
protein tyrosine 
phosphatase delta 
gene, catalytic 
domain, partial cds. 



M.fascicularis gene
for apolipoprotein C- 

III 

Pseudomonas braB 



gene for branched 
chain amino acid 
transport carrier (LIV-
II)



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



0.060 



<NONE> 



0.060 



<NONE> 



0.060 



3452285 



0.060 



730843 



0.059 



1835622 



DESCRIPTION 



<NONE> 



<NONE> 



(AF044915) polar tube protein 
PTP55 precursor 



SHUTTLE CRAFT PROTEIN 
>gi|487400 



P VALUE 



<NONE> 



0.28 



2e-04 



(U85718) CCML [Pseudomonas 
putida GB-1]



756 | AF072405



Gossypium hirsutum
cotton fiber expressed
protein 2 (CFE2)
mRNA, complete cds



0.059 



423766 



757 | AF012899



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds



758 | AF093268



759 | X61046



Rattus norvegicus
homer-1c mRNA,
complete cds



0.056 



2662481 



0.054 



Hydra N-COL 2 
mRNA for mini- 
collagen, partial cds 



547847 



0.053 



<NONE> 



alkaline phosphatase, 145K - 
Synechococcus sp. 



4.7 



(AF034859) juvenile hormone 
resistance protein 



LECTIN PRECURSOR 



7.0 



760 | AJ005813 



761 | S79843



Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 



0.052 



<NONE> 



{random amplified 
hybridization 
microsatellite 
RAHM] [Beta 
vulgaris=sugar beets. 
Genomic, 537 nt] 



0.025 



1730145 



<NONE> 



<NONE> 



GAMETOGENESIS 
EXPRESSED PROTEIN GEG- 
154 >gi|2137331|pir||I48361
gene GEG-154 protein - mouse
>gi|550123 (X71642)
pid:g550123 [Mus musculus]






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



762 | AB000096 



763 



Z62366 



DESCRIPTION 



Mouse mRNA for



GATA-2 protein, 
complete cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



0.023 



H.sapiens CpG DNA. 
clone 67h7, forward 
read cpg67h7.ft1a



<NONE> 



0.023 



3123312 



DESCRIPTION 



P VALUE



<NONE> 



ZINC FINGER PROTEIN 142 
(KIAA0236) to Human zinc 
finger protein(ZNF142) [Homo 
sapiens] 



<NONE> 



764 | L11670



Human

transmembrane 
glycoprotein (CD53) 
gene, exons 2 through 
8. 



5.9 



0.023 



80636 



hypothetical 67K protein - 
Mycobacterium fortuitum 
plasmid pAL5000 >gi|149986
(M60875) ORF2 



766 



3.4 



765 | D83984



X98890 



Sulculus diversicolor 
DNA for IDO-like
myoglobin, complete 

cds 

S. tuberosum mRNA 



0.023 



3114665 



(AF061267) inner membrane 
component HtxE [Pseudomonas 
stutzeri] 



3.4 



for inorganic 
phosphate 
transporter, StPTl 



0.023



683532 



(X02155) thyroglobulin [Bos
taurus] 



767 | U58835 



769 



770 



Dissostichus mawsoni 
preprotrypsin gene. 
complete cds 



0.022 



768 | AJ009630



Glomus versiforme 
chitin synthase gene 
(clone Gvchs3) 



<NONE> 



<NONE> 



<NONE> 



0.022 



<NONE> 



J04040 



X74908 



771 | L07293



Human glucagon 
mRNA, complete cds 
L.esculentum Asr3 



0.022 



<NONE> 



Shigella dysenteriae 
O-antigen 
polysaccharide 
biosynthesis rfbX, O- 
antigen polymerase 
(rfc), rhamnosyl 
tranferase I and II 
(rfbR and rfbQ) and 
rfbD genes, complete 
cds. 



0.022 



<NONE> 



0.022 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


772


AF040094


Mus musculus
inositol
polyphosphate 5-
phosphatase II
(INPP5P) mRNA,
complete cds


0.022


<NONE>


<NONE> 


<NONE> 


773 


X76776


H.sapiens HLA-DMI
gene


0.022


<NONE>


<NONE> 


<NONE> 


774 


AE001521


Helicobacter pylori, 
strain J99 section 82 
of 132 of the 
complete genome 


0.022


<NONE>


<NONE> 


<NONE> 


775 


X16004


A.longa rbcL, rpl5,
rps8, rpl36, rps14,
rps2, trnI, trnF, trnC
and rpoB (partial)
genes > ::
emb|X75651|ALRIBP
A.longa plastid genes
for ribosomal
proteins, tRNAs,
RNA polymerase
subunit beta and
rubisco large subunit


0.022


<NONE> 


<NONE> 


<NONE> 


776


Y12707


Lactococcus lactis 
cremoris plasmid 
pHW393 DNA, 
rlladii, mlladii genes 


0.022


<NONE> 


<NONE> 


<NONE> 


777


U27118


Arabidopsis thaliana 

glutamyl-tRNA 

reductase 


0.022 


<NONE> 


<NONE> 


<NONE> 


778


Z96622


H.sapiens telomeric
DNA sequence, clone
5PTEL002, read
>PTELOO002.seq


0.022


191333


(J05503) carbamoyl-phosphate
synthetase (E.C.6.3.5.5)


9.8


779


D83984


Sulculus diversicolor
DNA for IDO-like
myoglobin, complete
cds


0.022


1078509


probable membrane protein
YDR018c - yeast


9.7


780


Z77952


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA4A3


0.022


4204206


(AB022786) N-acetyl-beta-D-
glucosaminidase [Enterobacter
sp.]


7.5






Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION



781 



M10217 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



Xenopus laevis



mitochondrial DNA,
complete genome. 



782 | M55147



783 | X58839



784 | M26185



Pea chloroplast 
glyceraldehyde-3- 
phosphate 
dehydrogenase 
(Gpbl) gene, 
complete cds. 



0.022 



2145763



DESCRIPTION 



B2168_C2_205 protein -
Mycobacterium leprae 



Acholeplasma virus 
MV-L1 DNA for 
complete circular 
genome 



Mouse c-myb 
oncogene, exon 1 and 
exon 2 (partial). 



0.022 



417308 



0.022 



3273189 



PROBABLE HELICASE 
MOT1 Motlp is a probable 
helicase essential for vegetative 
growth on rich glucose medium 
at 30 degree C: Swiss-Prot 
Accession number P32333 
similar to S. cerevisiae RAD26 
gene product: Swiss-Prot 
Accession number P40352 



P VALUE



7.3 



(AB008757) subunit II of 
c(o/b)3-type cytochrome c 
oxidase [Bacillus 
stearothermophilus]



0.022 



138592 



VITELLOGENIN I

PRECURSOR (YOLK
PROTEIN 1)
>gi|72270|pir||VJFF1
vitellogenin I precursor 
unnamed protein product 
[Drosophila melanogaster] 



4.2 



4.1 



785 | AF061195



Streptomyces albus 
valine dehydrogenase 
(Vdh) gene, complete 
cds 



0.022 



2088768 



(AF003145) B0414.8 gene 
product [Caenorhabditis 
elegans] 



786 | AF053622



Homo sapiens alpha 
1,2-mannosidase IB 
gene, exon 9 



Z71500 



S. cerevisiae 
chromosome XIV 
reading frame ORF
YNL224c 



0.022 



1352361 



EARLY GROWTH 
RESPONSE PROTEIN 1 fish 
>gi|531456 (U12895) egr1
[Danio rerio] rerio]



0.022



1708875 



PUTATIVE TUMOR 
SUPPRESSOR LUCA15 
sapiens] 



0.86 



0.36 



0.16 



D10471 



Herpes simplex virus 
type 2 genomic DNA 
for 0.74-0.84 region, 
complete cds



0.022 



3132276 



(AB011486) short ORF [TT
virus] 



0.13 



U43082 



Zea mays T 
cytoplasm male 
sterility restorer 
factor 2 (rf2) mRNA, 
complete cds 



0.022 



3319720 



(AL031035) putative aldehyde 
dehydrogenase [Streptomyces 
coelicolor]



0.011






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


790 


X86913


H.sapiens simple 
tandem repeat DNA 
(clone wg3a6) 


0.021


<NONE>


<NONE>


<NONE> 


791 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.021


<NONE>


<NONE>


<NONE>


792 


U34016


Nannostomus sp. 
large subunit rRNA 
gene, mitochondrial 
gene encoding 
mitochondrial rRNA, 
partial sequence. 


0.021


<NONE>


<NONE>


<NONE>


793 


X00845 


Yeast mitochondrial 
genes for 15S rRNA 
and tRNA-Trp 


0.021


<NONE>


<NONE>


<NONE>


794 


AB012113


Homo sapiens gene 
for CC chemokine 
PARC precursor, 
complete cds 


0.021


<NONE>


<NONE>


<NONE>


795 


U62395 


Daucus carota 
globulin-like protein 
(Gea8) gene, 
complete cds 


0.021 


<NONE> 


<NONE> 




796


M22718 


P. falciparum actin II 
gene, complete cds. 


0.021


2623773


(AF004835) tyrocidine
synthetase 3 [Brevibacillus
brevis]


8.8


797


U27118 


Arabidopsis thaliana 

glutamyl-tRNA 

reductase 


0.021


3549885


(AJ006631) cysteine-rich
secretory protein-1 [Equus
caballus]


8.8


798 


X99832 


H.sapiens CLN3 
gene, complete CDS 


0.021


262249


(S52010) orf1 5' of EpoR [mice,
Peptide, 85 aa] [Mus sp.]


8.7


799


AF016266


Homo sapiens TRAIL
receptor 2 mRNA,
complete cds


0.021


729048


SUCCINYL-
COA:COENZYME A
TRANSFERASE transferase
[Clostridium kluyveri]


8.7










800


292541


Human DNA
sequence from PAC
179115, BRCA2 gene
region chromosome
13q12-13 contains
lactase-phlorizin
hydrolase (LCT)


0.021


585820


LIPOPOLYSACCHARIDE 1,2-N-
ACETYLGLUCOSAMINETRANSFERASE
>gi|466761
(U00039) rfaK [Escherichia
coli] >gi|1790053 (AE000440)
probably hexose transferase;
lipopolysaccharide core
biosynthesis


5.3
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



801 



S58588 



802 | M60522



803 | AF045654 



804 | M69023 



DESCRIPTION 



P VALUE 



dopamine D2 



receptor [human 
brain, Genomic, 3794 
nt. segment 4 of 5] 



Rat nerve growth 
factor-inducible 
protein (VGF) gene, 
complete cds. 



Gallus gallus
neuregulin beta-1a
mRNA, complete cds 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



0.021 



2677620 



0.021 



4103934 



DESCRIPTION 



(Y08029) NAD(P)(+)--arginine
ADP-ribosyltransferase
[Oryctolagus cuniculus] 



P VALUE



0.021 



2746829 



Human globin gene. 



0.021 



3880259 



(AF030050) replication factor C
[Rattus norvegicus] 



(AF040647) No definition line 
found [Caenorhabditis elegans] 



(Z82056) T26H5.8
[Caenorhabditis elegans]
>gi|3880787|gnl|PID|e1350288
(AL032620) T26H5.8 



5.1 



3.1 



3.0 



Z65960 



H.sapiens CpG DNA, 
clone 69d2, reverse 
read cpg69d2.rt1b.



0.021 



1707245 



X97073 



A.oligospora gene 
encoding lectin 
D.melanogaster



0.021 



116949 



807 | X56491



L78760 



mRNA for gene 
containing opa 
repetitive element 



Homo sapiens 
(subclone 1_f6 from
P1 H31) DNA
sequence 



809 | AB007864



Homo sapiens 
KIAA0404 mRNA, 
partial cds 



0.021 



2842750 



(U80845) similar to family 1 of 
G-protein coupled receptors 
[Caenorhabditis elegans]



CORE ANTIGEN 
>gi|73601|pir||NKVLC2 core 
antigen - woodchuck hepatitis 
virus2>gi|336135 



0.021 



113671 



HOMEOBOX PROTEIN DLX-
7 >gi|1620520



! ! ! ! ALU CLASS F WARNING 
ENTRY !!!! 



0.021 



118144 



ACETYLSERINE 
SULFHYDRYLASE A) (O- 
ACETYLSERINE (THIOL )- 
LYASE A) (CSASE A) 
>gi|68323|pir||SYEBAC cysteine 
synthase (EC 4.2.99.8) A - 
Salmonella typhimurium 
>gi| 153935 (M21450) cysK 
protein [Salmonella 
typhimurium] 



0.79 



0.47 



0.16 



0.15 



0.12 



810 | AL021932 



Mycobacterium 
tuberculosis H37Rv 
complete genome; 
segment 22/162 



0.021 



2909514 



(AL021932) hypothetical 
protein Rv0439c



7e-10 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















811 


U89991


Hypocrea jecorina 
mannose-1-phosphate
guanylyltransferase 
(MPG1) mRNA, 
complete cds 


0.021 


3581924 


(AL031538) mannose-1-
phosphate guanyl transferase 
[Schizosaccharomyces pombe] 


6e-20 


812 


X00641 


Sugar beet 
mitochondrial 
minicircle pO 
sequence 


0.020 


<NONE> 


<NONE> 


<NONE> 


813 


Z50097 


D.melanogaster 
mRNA for hdc 
protein. 


0.020 


<NONE> 


<NONE> 


<NONE> 


814 


AF044866 


Phoebis sennae large 
subunit ribosomal 
RNA gene, partial 
sequence; tRNA-Val 
gene, complete 
sequence; and small 
subunit ribosomal 
RNA gene, partial 
sequence, 

mitochondrial genes 
for mitochondrial 
RNAs 


0.020


<NONE> 


<NONE> 


<NONE> 


815 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


816 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


817 


AE001405 


Plasmodium 
falciparum 
chromosome 2, 
section 42 of 73 of 
the complete 
sequence 


0.020 


2196776 


(AF003342) bunched gene 
product [Drosophila 
melanogaster] 


8.4 


818 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.020 


627071 


histidine-rich protein - 
Plasmodium lophurae 


2.8








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















819 


Y13304 


Hylobates hoolock 
mitochondrial DNA 
for cytb gene, Horace 


0.020 


285580 


(D10043) ORF [Acetobacter
pasteurianus] 


2.1


820 


Z66539 


H. sapiens creatine 
transporter gene 


0.020 


1703594 


(uau4jyj coded for by C.
elegans cDNA yk7c8.5; coded
for by C. elegans cDNA
yk133b3.5; coded for by C.
elegans cDNA yk65a4.5; coded 
for by C. elegans cDNA 
yk7c8.3; coded for by C. 
elegans cDNA CEESQ66F; 
coded for by C. elegans cDNA 
yk65a4.3;... 


0.98 


821 


AF053622 


Homo sapiens alpha 
1,2-mannosidase IB 
gene, exon 9 


0.020 


1352361 


EARLY GROWTH 
RESPONSE PROTEIN 1 fish 
>gi|531456 (U12895) egrl 
[Danio rerio] rerio] 


0.72 


822 


M20555 


Human MHC class II 
HLA-DRw53-beta 
(DR4,w4) gene, 
exons 2,3,4,5,6. 


0.020 


465569 


j-i i ru i ni2 i jo. l iSJ^ 
PROTEIN IN SBCB-HISL 
INTERGENIC REGION 
>gi|405956 (U00009) 
ORF_ID:o349#4; similar to 
[SwissProt Accession Number 
P33015] [Escherichia coli] 
>gi|1736693|gnl|PID|dl016570 
Number P33015] [Escherichia 
coli] >gi| 1788323 (AE000292) 
putative transport system 
permease protein [Escherichia 
coli] 


0.43 


823 


M20555 


Human MHC class II 
HLA-DRw53-beta 
(DR4,w4) gene, 
exons 2,3,4,5,6. 


0.020 


1709751 


COENZYME PQQ 
SYNTHESIS PROTEIN F 
synthesis F - Pseudomonas 
fluorescens >gi|929802 


0.42 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



824 | AJ005015



825 | AF034099



P VALUE | ACCESSION 



DESCRIPTION 



Homo sapiens mRNA 
for putative SMC-like 
protein, partial 



0.020 



267449 



uirulllLllL.-\L 1_.J IVL/ 

CHROMOSOME III 
>gi|102507|pir||S15787 
hypothetical protein 1 (cosmid 
ZK637) - Caenorhabditis 
elegans Genefinder; cDNA EST 
yk217b5.3 comes from this 
gene; cDNA EST yk217b5.5 
comes from this gene; cDNA 
EST yk340g12.3 comes from
this gene; cDNA EST
yk340g12.5 comes from this
gene; cDNA EST yk428c5.5 
co... 



Laccaria bicolor 
glyoxal malate 
synthase protein 
mRNA, complete cds



0.020 



1109847 



(U41538) No definition line 
found [Caenorhabditis eles 



P VALUE



1e-12



826 | AF100694



Mus musculus 
Pontin52 mRNA 
complete cds 



827 | AF093268 



Rattus norvegicus 
homer-1c mRNA,
complete cds 



0.019 



132836 



60S RIBOSOMAL PROTEIN 
L28 protein L28 [Rattus 
norvegicus] 



828 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



0.019 



2633401 



(Z99109) similar to DNA
exonuclease 



0.019 



2492604 



MULTIDRUG RESISTANCE 
PROTEIN CDR2 albicans] 



829 | U67538



830 



U56088 



Methanococcus 
jannaschii section 80 
of 150 of the 
complete genome



Human periodic 
tryptophan protein 2 
(PWP2) gene, exons 
3 to 14 



0.019 



1723566 



0.019 



2144804 



PUTATIVE

GLUCOSYLTRANSFERASE
C17C9.07

>gi|1314159|gnl|PID|e241760
(Z73099) SPAC17C9.07,
putative glucosyl transferase len
501, similar to
SW:ALG8_YEAST P40351
glucosyltransferase ALG8
pombe]



collagen alpha 1(11) chain 
bovine 



5.7 



4.5 



4.4 



2.7 



831 



U76524 



Sambucus nigra 
ribosome inactivating
protein precursor 
mRNA, complete cds 



0.018 



1916976 



(U91682) vitelline membrane
protein homolog [Aedes 
aegypti] 



7.2 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















832 


AF026258 


Onobrychis viciifolia 
chalcone synthase 
(CHS) mRNA, 
complete cds 


0.018 


763076 


(Z48799) ZP3 [Cyprinus carpio]
>gi|777724 (L41637) egg 
membrane protein [Cyprinus 
carpio] 


5.2 


833 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 


0.009 


3955011


(AJ005438) beta adrenoreceptor 
B 


0.60 






834


X71603


C.jejuni VS1 DNA > ::
emb|A39603|A39603
Sequence 2 from
Patent WO9417205 > ::
gb|I76090|I76090
Sequence 2 from
patent US 5691138


0.008 


<NONE> 


<NONE> 


<NONE> 


835 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.008


138116 


HEAD FIBER PROTEIN 
(LATE PROTEIN GP8.5) 
>gi|75846|pir||WMBP8H gene 
8.5 protein - phage PZA 
>gi|216057 (M11813) head
fiber protein 


8.1 


836 


X91751 


Bovine herpesvirus 
type 1 UL7 gene 


0.008 


1711436 


SUPEROXIDE DISMUTASE 
(FE) i. 15.1.1) (Fe)- 
Pseudomonas aeruginosa 
>gi|409767 


5.9 


837 


M95594 


Arabidopsis thaliana 
1-aminocyclopropane-
1-carboxylate
synthase (ACS2)
gene, complete cds.


0.008 


683698 


(Z48229) orf 1 gene product 
[Saccharomyces cerevisiae] 


1e-06


838 


U67465 


Methanococcus 
jannaschii section 7 
of 150 of the
complete genome 


0.008 


3874664 


(Z68493) predicted using 
Genefinder 


1e-07


839 


X72388 


B.taurus mRNA for 
filensin 


0.008 


100174 


1-aminocyclopropane-l- 
carboxylate synthase 


7e-09


840 


U22398 


Human Cdk-inhibitor 
p57KIP2 (KIP2) 
mRNA, complete cds. 


0.008 


2228750 


(U93868) RNA polymerase III 
subunit [Homo sapiens] 


2e-lS 


841 


L42546 


Xenopus laevis LIM 
class homeodomain 
protein 


0.007 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID




DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










842 


AF041428 


ribosomal protein s4 
X isoform gene, 
complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


843 


ArUOOzz / 


Secale cereale omega 
secalin gene, 
complete cds 


0.007


<NONE>


<NONE> 


<NONE> 


844 


D86254


Human MHC (HLA) 
DRB intron 1 DNA, 
partial sequence 


0.007 


<NONE> 


<NONE> 


<NONE> 


845 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


846 


Y07738 


M.musculus gene for 
vimentin 


0.007 


<NONE> 


<NONE> 


<NONE> 


847 


AJ005813 


Arabidopsis thaliana

mRNA for 
neoxanthin cleavage 
enzyme 


0.007 


<NONE> 


<NONE> 


<NONE> 


848 


AF055119 


Homo sapiens alpha- 
tectorin (TECTA) 
gene, exon 6 


0.007 


<NONE> 


<NONE> 


<NONE> 


849 


Mo 1 lbO 


Zucchini 1-
aminocyclopropane-1-
carboxylate synthase


0.007


<NONE>


<NONE> 


<NONE> 


850 


Y11050


Homo sapiens DSG3 
gene, partial intron 
and partial exon 6, 
140 bp 


0.007 


<NONE> 


<NONE> 


<NONE> 


851 


X61204 


M.voltae vhuD, 
vhuG, vhuA, vhuU & 
vhuB genes


0.007 


<NONE> 


<NONE> 


<NONE> 


852 


AB012105 


Brassica rapa mRNA 
for SLG45, complete
cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


853 


S43882 


telomere: 

{ minichromosome, 
repeats } 

[Trypanosoma brucei. 
Genomic, 1170 nt]


0.007 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















854 


L32674 


Geomydoecus nadleri 
mitochondrial 
cytochrome oxidase I 
gene, partial cds. 


0.007 


<NONE> 


<NONE> 


<NONE> 


855 


U58732 


Caenorhabditis 
elegans cosmid 
F48D6. 


0.007 


<NONE> 


<NONE> 


<NONE> 


856 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007


<NONE> 


<NONE> 


<NONE> 


857 


Z35284 


H.sapiens mRNA for 
MDR3 P- 
glycoprotein 


0.007 


1730696 


HYPOTHETICAL 121.1 KD 
PROTEIN IN BIO3-HXT17
INTERGENIC REGION 
PRECURSOR YNR067c - yeast 
(Saccharomyces cerevisiae) 


9.5 


858 


X15217 


Human sno oncogene 
mRNA for snoA 
protein, ski-related 


0.007


902455 


(U24203) membrane protein 
[Escherichia coli] 


8.8 


859 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.007 


1684636 


(Y09454) ORF3 [Lactobacillus 
casei bacteriophage A2] 


8.3 


860 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007 


3878803 


(Z48795) R05H5.7 
[Caenorhabditis elegans] 


8.3 


861 


S76317 


1 1V=I«U-2UU Kda I 
membrane protein 
scavenger receptor 
homolog {clone 18, 
intron and flanking 
exons 14 and 15} 
[sheep, lymph node,
lymphocytes, 
Genomic. 30 S nt, 
segment 2 of 2] 


0.007 


294747 


(L08174) ORF2 
[Romanomermis culicivorax]


7.4 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE


862 


D88084 


Pedicularis
verticillata 
chloroplast DNA, 
intergenic region 
between trnT(UGU) 
and trnL(UAA) 5' exon


0.007


2555187 


(AF026789) vitellogenin 
[Pimpla nipponica] 


6.9 


863 


X58869 


Chicken mRNA for 

aldehyde 

dehydrogenase 


0.007 


115978 


CD30L RECEPTOR 
PRECURSOR 
(LYMPHOCYTE 
ACTIVATION ANTIGEN 


6.5 


864 


D87120 


Homo sapiens mRNA 
for GS3786, complete 
cds 


0.007 


3879589 


V^juu / j j i mum uui uuinaiiu — 
cDNA EST EMBL:D35637 
comes from this gene; cDNA 
EST yk322a3.5 comes from this 
gene; cDNA EST yk397b2.5 
comes from this gene; cDNA 
EST yk348b11.5 comes from
this gene; cDNA EST 
yk397b2.3 comes fr... 
>gi|3880965|gnl|PID|el350578 
comes from this gene; cDNA 
EST yk322a3.5 comes from this 
gene; cDNA EST yk397b2.5 
comes from this gene; cDNA 
EST yk348b11.5 comes from
this gene; cDNA EST

yk397b2.3 comes ... 


5.1 


865 


X68793 


H.sapiens gene for 
antithrombin III 


0.007 


2358285 


(AF010403) ALR [Homo
sapiens]


3.8 


866 


AJ001596 


Danio rerio mRNA 
for opioid receptor 
homologue


0.007 


2507509 


Hypothetical 29.8 kd 

PROTEIN IN HOLB-PTSG 
INTERGENIC REGION 
>gi|1787342 (AE000210) orf,
hypothetical protein
[Escherichia coli] protein in
holB 3' region [Escherichia
coli]


1.9 


867 


AF061195


Streptomyces albus
valine dehydrogenase
(Vdh) gene, complete
cds


0.007 


208S76S


(AF003145) B0414.S gene
product [Caenorhabditis
elegans]


1.9 


868 


AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme
-nzyme 


0.007 


1710105


UDP-N-
ACETYLGLUCOSAMINE 2-
EPIMERASE UDP-N-
acetylglucosamine 2-epimerase
[Plasmid pWQ799]


1.7 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Zebrafish retinoic 










869 


L03398 


acid receptor alpha 
2.A 


0.007 


2239219 


(Z97210) hypothetical protein 


0.77 


870 


D63484 


Human mRNA for 
KIAA0150 gene, 
partial cds 


0.007 


19917 


(Z14014) Pistil extensin like 
protein, partial CDS only 


0.61 


871 


M31483 


Maize glyceraldehyde
3-phosphate 
dehydrogenase, 3' 
end. 


0.007 


543068 


mucin, tracheobronchial - dog 
>gi|402558 


0.45 


872 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


0.007 


2494941 


ALPHA-2B ADRENERGIC 

RECEPTOR adrenoceptor 

[Cavia porcellus] 

>gi|1587159|prf||2206293B

adrenoceptor alpha2B [Cavia 

porcellus] 


0.42 


873 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


0.007 


1110587 


(S79410) nuclear localization
signals Peptide, 140 aa] [Mus 
sp.] 


0.26 


874 


X88931 


H.sapiens PAL2A 
gene 


0.007 


1706176 


CUTINASE TRANSCRIPTION 
FACTOR 1 ALPHA 
>gi|1262912 (U51671) cutinase 
transcription factor 1 [Fusarium 
solani f. sp. pisi] 


0.21 


875 


S74155 


zRAR alpha=retinoic
acid receptor alpha
[zebrafish, embryos,
mRNA, 1773 nt] 


0.007 


2239219 


(Z97210) hypothetical protein 


0.11 


876 


M74193 


Petromyzon marinus
plasma albumin 
mRNA, complete cds. 


0.007 


73088S 


OCTAPEPTIDE-REPEAT
PROTEIN T2 


0.011 


877 


U03673


Saccharomyces
cerevisiae Spp41p
(SPP41) gene,
complete cds. 


0.007 


3820885 


(AL033126) 65G3.k
[Drosophila melanogaster]


0.001 


878 


D37766


Homo sapiens mRNA
for Laminin-5 beta3
chain, complete cds


0.007 


1235974


(X96713) collagen [Globodera
pallida]


3e-06 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION 


P VALUE 






Caenorhabditis 










879 


AF022388 


elegans putative 
transcription factor 
MAB-3 (mab-3) 
gene, complete cds 


0.007 


3747107 


(AF095741) unknown [Rattus 
norvegicus]


5e-09 


880 


U89984 


Acanthamoeba 
castellanii 
transformation- 
sensitive protein 
homolog mRNA,
complete cds 


0.007 


1890281 


(U89984) transformation- 
sensitive protein homolog


2e-09 


881 


AB020689 


Homo sapiens mRNA
for KIAA0882
protein, partial cds


0.007 


3880809 


rabGAP domains; cDNA EST 
EMBL:D34945 comes from this 
gene; cDNA EST 
EMBL:D27313 comes from this 
gene; cDNA EST 
EMBL:D34829 comes from this 

EMBL:D27312 comes from this 
gene; cDNA ... Probable 
rabGAP domains; cDNA EST
EMBL:D34945 comes from this 
gene; cDNA EST 
EMBL:D27313 comes from this 
gene; cDNA EST 
EMBL:D34829 comes from this 
gene; cDNA EST 
EMBL:D27312 comes from this 
gene; cDNA ... 


le-23 


882 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.006 


<NONE> 


<NONE>




883 


AF027173


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds


0.006 


<NONE> 


<NONE> 


<NONE> 


884 


U76524


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


0.006 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















885 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.006 


<NONE>


<NONE> 


<NONE> 


886 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.006 


<NONE>


<NONE> 


<NONE> 


887 


AB012106


Brassica rapa mRNA

for SRK45, complete 
cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


888 


M80529 


Rattus norvegicus 
ceruloplasmin gene, 
exon 1 and 5' flank 


0.006 


<NONE> 


<NONE> 


<NONE> 


889 


AF027173 


Arabidopsis thaliana

cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.006 


99408 


hypothetical protein 6 - 
Chlamydomonas reinhardtii 
transposon 

>gi|1360717|gnl|PID|e33461
reinhardtii] 


9.6 


890 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


0.006 


4039024 


(AF039110) polyprotein 
[Rubella virus] 


9.3 


891 


AF093268 


Rattus norvegicus 
homer- lc mRNA, 
complete cds 


0.006 


160533 


(M9442S) merozoite surface 
antigen 1 [Plasmodium vivax] 


7.5 


892 


AB012106 


Brassica rapa mRNA
for SRK45, complete 
cds 


0.006 


4019458 


(AF093984) envelope 
glycoprotein [Human 
immunodeficiency virus type 1] 


7.0 


893 


AJ005813 


Arabidopsis thaliana

mRNA for 
neoxanthin cleavage 
enzyme 


0.006 


1916976 


(U91682) vitelline membrane 
protein homolog [Aedes
aegypti] 


6.8 


894 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


0.006 


102059


promastigote surface antigen-2
(clone 4.6) - Leishmania major
(fragment) >gi|9583 (X57135)
surface antigen P2 [Leishmania
major]


2.4 


895 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


0.006 


3171241


(AF067204) transcription factor
3F- 1 [Danio rerio] 


1.0 


896 


X993S4


M.musculus mRNA
for paladin gene


0.003 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE 



897 | AF027174



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 



0.003 



<NONE> 



<NONE> 



<NONE> 



898 | AE001148



Borrelia burgdorferi
(section 34 of 70) of 
the complete genome 



0.003 



4160388 



899 | AF027173



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete 
cds 



0.003 



1709213 



(AJ011856) ORF Q0255
[Saccharomyces cerevisiae] 



7.6 



NUCLEAR ENVELOPE PORE 
MEMBRANE PROTEIN POM 
121 (PORE MEMBRANE 
PROTEIN OF 121 KD) (P145) 



900 | U72396



Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 



0.002 



<NONE> 



<NONE> 



901 | AF100694



Mus musculus
Pontin52 mRNA, 
complete cds 



902 | AF104631



903 | AF100694



904 | AB012106



Chlamydomonas
reinhardtii light 
harvesting complex II 
protein precursor 
(Lhcb3) mRNA, 
complete cds 



0.002 



<NONE> 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Brassica rapa mRNA
for SRK45, complete 
cds 



0.002 



<NONE> 



0.002 



<NONE> 



0.002 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



905 | M21339 



Human non-histone 
chromosomal protein 
HMG-14 gene, 
complete cds. 



0.002 



<NONE> 



<NONE> 



<NONE> 



906 | AF012899



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



0.002 



<NONE> 



<NONE> 



<NONE> 






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


907


X57103


Human h-lys gene for
lysozyme (upstream
region)

0.002 


<NONE>


<NONE> 


<NONE> 


908


AF074386


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.002 


<NONE>


<NONE>


<NONE>


909 


U01066


Human CD4 
promoter, partial 
sequence. 


0.002 


<NONE>


<NONE> 




910


L28094


Barley mRNA 
sequence. 


0.002 


<NONE>


<NONE>


<NONE>
<NONE>


911 


AD000833


Homo sapiens DNA 
from chromosome 19 
cosmid f 19399 (-17 
kb EcoRI restriction 
fragment) 


0.002 


<NONE> 


<NONE> 


<NONE>


912 


AJ011701


Homo sapiens TRHR 
gene promoter and 
exons 1-2, partial 


0.002 


<NONE>


<NONE>


<NONE>


913 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.002 


<NONE> 


<NONE> 


<NONE>


914 


AF037062 


Homo sapiens retinol 
dehydrogenase gene, 
complete cds 


0.002 


<NONE> 


<NONE> 


<NONE>


915 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds


0.002


<NONE>


<NONE>


<NONE>


916


U67608


Methanococcus
jannaschii section 150
of 150 of the
complete genome


0.002


<NONE>


<NONE>


<NONE>


917


AF027173


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds


0.002


<NONE>


<NONE>


<NONE>


918


Z46736


H.sapiens DNA for
repeat region (ABM-
C82)


0.002


<NONE>


<NONE>


<NONE>


919


AB012106


Brassica rapa mRNA
for SRK45, complete
cds


0.002


<NONE>


<NONE>


<NONE>



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






X.laevis mRNA for










920 


Z85983 


NOVA protein 


0.002 


<NONE> 


<NONE> 


<NONE> 


921 


AF027173


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


922 


S61Q77 


medium-chain acyl- 
CoA dehydrogenase 
{exon 10, intron 10} 
[human, Genomic, 
1407 nt] 


0.002 


<NONE> 


<NONE> 


<NONE> 


923 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.002 


<NONE> 


<NONE> 


<NONE> 


924 


AB012105 


Brassica rapa mRNA
for SLG45, complete
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


925 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


926 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


927 


X51646


H.sapiens DNA for 
dopamine D2 
receptor gene 


0.002 


3329125 


(AE001337) Yop C/Gen 
Secretion Protein D [Chlamydia 
trachomatis] 


9.5 


928 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.002 


465762 


HYPOTHETICAL 112.1 KD
PROTEIN C06G4.1 IN 
CHROMOSOME III 
>gi|630524|pir||S44748 
C06G4.1 protein - 
Caenorhabditis elegans 
>gi|409292 (L2559S) homology 
with vigilin; coded for by C. 
elegans cDNA 

GenBank:M88954 (CEL12C9); 
putative [Caenorhabditis 


8.9 


929 


U4S47S 


Human skeletal 
muscle ryanodine 
receptor gene 


0.002 


2137221 


co-repressor protein - mouse 
>gi|642619


6.9 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










930 


AF100694


Pontin52 mRNA, 
complete cds 


0.002 


806536 


(Z22520) membrane protein 
[Bacillus acidopullulyticus]


6.3


931 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.002 


3881055 


(AL023844) Y48A6B.1 
[Caenorhabditis elegans] 


5.8 


932 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


0.002 


3878330 


(Z81097) K07A1.4 
[Caenorhabditis elegans] 


4.8 


933 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.002 


137640 


REPLICATION PROTEIN E1
papillomavirus 


4.0 


934 


AF019660


Mus musculus
nuclear orphan 
receptor RORgamma 


0.002 


1330365 


(U58757) similar to nucleotide 
pyrophosphatases 


3.9 


935 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.002 


1785972 


(U46951) ORF5; Method: 
conceptual translation supplied 
by author 


3.7 


936 


V00508 


Human gene for 
epsilon-globin. 


0.002 


1333804 


(X56082) protease 
[Ruminococcus flavefaciens] 


3.5 


937 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.002 


4153876 


(AC005531) similar to mouse
homeodomain-interacting 
protein kinase 2; similar to 
AF077659 (PID:g3702958) 


3.0 


938 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme


0.002 


1070461 


ornithine carbamoyltransferase
(EC 2.1.3.3) - yeast
(Saccharomyces cerevisiae)
>gi|929866 (X83502)
pid:e130025 [Saccharomyces
cerevisiae] >gi|1008256


2.8 


939 


S41458 


rod cGMP 
phosphodiesterase 
beta-subunit [human, 
mRNA, 3231 nt] 


0.002 


3450883 


(AF083334) fibroin [Antheraea 
pernyi] 


1.6 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Drosophila 










940 


X06286 


melanogaster Gart 
locus with genes for 
GARS=phosphoribos 
ylamineglycine 
ligase, 

AIRS=phosphoribosy
lformylglycinamidine
cyclo-ligase,
GART=glycinamide 
ribotide 

transformylase > :: 
gb|J02527|DROGAR 
T D. melanogaster 
Gart gene encoding 
two polypeptides with 
GAR synthase, AIR 
synthase, and GAR 
transformylase 

enzyme activities and

a pupal cuticle gene
nested within intron
A of the Gart gene. 


0.002 


2662054 


(AB004651) isocitrate lyase 


1.5 


941 


AF015812 


Homo sapiens RNA

helicase p68
(HUMP68) gene,
complete cds 


0.002 


3641659 


(AB008374) alpha 3 type I 
collagen 


1.1 


942 


X78925 


H. sapiens HZF2 
mRNA for zinc finger 
protein 


0.002 


141624 


ZINC FINGER PROTEIN ZFP- 
37 (MALE GERM CELL 
SPECIFIC ZINC FINGER 
PROTEIN) 


1.0 


943 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.002 


3879997 


f 74907 H weak similarity with
mu-type opioid receptor (Swiss 
Prot accession number (P33535) 


1.0 


944 


Z69639 


Human DNA 
sequence from 
cosmid L241B9, 
Huntington's Disease 
Region, chromosome 
4p16.3 contains
polymorphic VNTR 
pYNZ32. 


0.002 


3523162 


(AF076292) TGF-beta/activin 
signal transducer FAST-lp 


0.81 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 




DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















945 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.002 


2984161 


(AE000761) hypothetical 
protein [Aquifex aeolicus] 


0.80


946 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


0.002 


101830 


hypothetical protein B - chestnut 
blight fungus 


0.72 


947 


AF017307


Homo sapiens Ets- 
related transcription 
factor (ERT) mRNA, 
complete cds 


0.002 


200531 


(M18071) prion protein [Mus 
musculus] 


0.72 


948 


U11383 


Drosophila 
melanogaster Ovo- 
1028aa (ovo) mRNA, 
complete cds. 


0.002 


2465207


(AF016045) OVO-like 1
binding protein [Homo sapiens] 


0.35 


949 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.002 


3834294 


(U80846) No definition line 
found [Caenorhabditis elegans] 


0.29 


950 


AF086315 


Homo sapiens full 
length insert cDNA 
clone ZD52F10 


0.002 


545067 


(S683^6) action potential 
broadening potassium 
channel=Shab [Aplysia, bag cell 
neurons, head ganglia, Peptide, 
905 aa] [Aplysia] 
>gi|743110|prf||2011375A K 
channel [Aplysia californica] 


0.15 


951 


X53096 


S. aureus genes 
encoding Sau96I
DNA

methyltransferase and
Sau96I restriction 
endonuclease 


0.002 


2529575 


(AF018164) kinesin-like protein 
3C [Homo sapiens] 


0.11 


952 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.002 


729918 


LA PROTEIN HOMOLOG (LA 
RIBONUCLEOPROTEIN) (LA 
AUTOANTIGEN HOMOLOG)


0.092 


953 


X73973 


G.gallus RAR- 
gamma2 mRNA for 
retinoic acid receptor 


0.002 


586122 


TRICHOHYALIN 
>gi|423321|pir||A40691
trichohyalin - sheep >gi|295941
(Z18361) trichohyalin


0.073 


954 


S41458 


rod cGMP 
phosphodiesterase 
beta-subunit [human,
mRNA, 3231 nt] 


0.002 


1017427


(X90569) elastic titin [Homo
sapiens] 


0.013 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


955 


M35887 


D.melanogaster 
defective chorion-1
fc125 (dec-1) gene,
complete cds. 


0.002 


1825606 


(U88169) similar to 
molybdoterin biosynthesis 
MOEB proteins [Caenorhabditis 
elegans] 


0.008 


956 


AF034099 


Laccaria bicolor 
glyoxal malate 
synthase protein 
mRNA, complete cds 


0.002 


1825593 


(U88167) D2092.2 gene product
[Caenorhabditis elegans]


1e-06


957 


AF033929 


Bactrocera dorsalis 
strain Tahiti 
mitochondrial D-loop 
region, complete 
sequence 


9e-04 


<NONE> 


<NONE> 


<NONE> 


958 


AB012106


Brassica rapa mRNA
for SRK45, complete 
cds 


8e-04 


<NONE> 


<NONE> 


<NONE> 


959 


AF029062 


Homo sapiens DEAD- 
box protein (BAT1) 
gene, partial cds 


8e-04 


<NONE> 


<NONE> 


<NONE> 


960 


U70671 


Human ataxin-2 
related protein 
mRNA, partial cds 


8e-04 


<NONE> 


<NONE> 


<NONE> 


961 


AF051709 


Dendrocopos 
leucopterus clone 2
microsatellite HrU2 
repeat region 


8e-04 


<NONE> 


<NONE> 


<NONE> 


962 


X14077


Pea phy gene for 

phytochrome 

apoprotein 


8e-04 


<NONE> 


<NONE> 


<NONE> 


963 


AC004497 


Homo sapiens 
chromosome 21, P1
clone LBNL#6 


8e-04 


457146 


(L27838) rhoptry protein
[Plasmodium yoelii]


9.6


964


AF077344 


Homo sapiens 
cartilage-derived C- 
type lectin 


8e-04 


3702123 


(AJ011707) TraD protein
[Escherichia coli]


8.5 


965 


X85117


H. sapiens epb72 gene
exons 2,3,4,5,6,7


8e-04 


2570059


(AJ004687) N-4 cytosine-
specific methyltransferase
[Neisseria gonorrhoeae]


6.8 


966 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-04 


1345359


COPPER TRANSPORT
PROTEIN CTR1 transport
protein - yeast (Saccharomyces
cerevisiae) gene product
[Saccharomyces cerevisiae]


6.7 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










967 


AF031403 


MLL/AF4 
translocation 
breakpoint 
t(4;11)(q21;23)


8e-04 


2498926 


SMALL PROTEIN B 
HOMOLOG A43259, from E. 
hirae [Mycoplasma 
pneumoniae] 


6.6 


968 


L29252 


Human (clone D13-2; 
L-iditol-2 

dehydrogenase gene, 
exon 4, exon 5, exon 
6 and exon 7. 


8e-04 


1488070 


(U63997) putative transposase 
[Enterococcus faecium]


5.2 


969 


X16995


Mouse N10 gene for
a nuclear hormonal
binding receptor


8e-04


1493833 


(U47323) stromal cell protein 
[Mus musculus] 


3.2


970 


M99412 


Human interleukin-8
receptor (IL8RB) 
gene, complete cds 


8e-04 


1346101 


4-AMINOBUTYRATE
AMINOTRANSFERASE
TRANSAMINASE) (GABA
AMINOTRANSFERASE)
homolog - smut fungus
(Ustilago maydis) >gi|881562
Emericella nidulans gamma-
amino-n-butyrate transaminase
Swiss-Prot Accession Number 
P14010 [Ustilago maydis] 


0.83 


971 


U37452 


Human Down 
Syndrome region of 
chromosome 21
genomic sequence,
clone A31D6-1C5. 


8e-04 


4164069 


(AF111093) latrophilin 3 splice
variant bbah [Bos taurus] 


0.26 


972 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-04 


1352877


HYPOTHETICAL 13.0 KD
PROTEIN IN RAD26-GEF1
INTERGENIC REGION
>gi|1077881|pir||S57057
probable membrane protein
YJR038c - yeast
(Saccharomyces cerevisiae)
>gi|1015688 (Z49538) ORF
YJR038c putative
[Saccharomyces cerevisiae]


0.23 


973 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


8e-04 


1788557


(AE000312) orf, hypothetical
protein [Escherichia coli]


0.19 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















974 


X83872 


H.vulgaris mRNA for

cAMP response 
element binding 
protein 


8e-04 


1175386 


HYPOTHETICAL 37.7 KD
PROTEIN C18B11.06 IN
CHROMOSOME I
>gi|2130289|pir||S58305
hypothetical protein
SPAC18B11.06 - fission yeast
hypothetical protein 
[Schizosaccharomyces pombe] 


0.005 


975 


M32514 


Rat simple sequence 
DNA, cloneS. 


8e-04 


2394492 


(AF024502) No definition line 
found [Caenorhabditis elegans] 


0.002 


976 


AF074386 


Sambucus nigra
hevein-like protein 
mRNA, complete cds 


8e-04 


2981631 


(AB012223) ORF2 [Canis 
familiaris] 


0.001 


977 




H.sapiens DNA for 
endogenous retroviral 
like element 


8e-04


ZVojI 1U


(Y12713) Pro-Pol-dUTPase
polyprotein


3e-04


978 


T ti /no 1 


Human myosin-IC 
mRNA, complete cds. 


8e-04




(AC002411) Strong similarity to
myosin heavy chain gb|Z34293 
from A. thaliana. [Arabidopsis 
thaliana] 




979 




Drosophila 
melanogaster dead- 
box protein
D.melanogaster
DEAD-box gene,
complete cds


8e-04


17*7 Ami 


(AJ010475) RNA helicase 
[Arabidopsis thaliana] 




980 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


981 




Mus musculus 
Pontin52 mRNA, 

complete cds


7e-04 






<NONE> 


982 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


983 


Z739S7 


Human DNA 
sequence from 
cosmid N120B6 on 
chromosome 22 
Contains ESTs, 
complete sequence 
[Homo sapiens]


7e-04 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION


DESCRIPTION 


P VALUE 






Brassica rapa mRNA 










984 


AB012106 


for SRK45, complete 
cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


985 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


986 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


987 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


988 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


7e-04 


<NONE> 


<NONE> 


<NONE> 


989 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


990 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


991 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


992 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


993 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


994 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


7e-04 


3327230 


(AB014608) KIAA0708 protein
[Homo sapiens]


9.5
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















995 


U76524


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


7e-04


3327230


(AB014608) KIAA0708 protein
[Homo sapiens] 




996 


rvTU !*¥ jo / 


Sambucus nigra 
hevein-like protein
mRNA, complete cds


7e-04


Jo / 0*00 


(Z93380) predicted using 
Genefinder; similar to 7tm 
receptor protein [Caenorhabditis 
elegans] 


7 t 


997 


U76524


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


7e-04


1 1 7Q77 1 


hypothetical protein MJ1293 - 
Methanococcus jannaschii 
>gi|1591931 (U67570) M. 
jannaschii predicted coding 
region MJ1293 [Methanococcus 
jannaschii] 


O.J. 


998 


U09412 


Human zinc finger 
protein ZNF134 
mRNA, complete cds 


7e-04 


1083336 


glutathione transferase (EC 
2.5.1.18) piA - mouse 


5.4 


999 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath-
A) mRNA, complete 
cds 


7e-04 


473515 


^ ivi i / o i y j is AJvii 
dehydrogenase subunit ND4 
[Asterina pectinifera]


3.7 


1000 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


7e-04 


1724097 


(U79772) female sex protein 
[Mercurialis annua]


3.3 


1001 


AF100694 


Mus musculus
Pontin52 mRNA, 
complete cds 


7e-04 


1197103 


(D49747) core, env, and part of 
E2/NS1 


3.2 


1002 


X16995 


Mouse N10 gene for 
a nuclear hormonal 
binding receptor 


7e-04 


345372 


unc-5 protein, long form -
Caenorhabditis elegans
>gi|258529|bbs|118648
(S47168) UNC-
5=immunoglobulin and
thrombospondin type 1 
transmembrane protein 
{alternatively spliced} aa] 
[Caenorhabditis elegans] 
>gi|2662596 (AF036698) C. 
elegans UNC-5 (NID:g25852) 


2.7 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1003 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


7e-04 


4204220 


(AB022866) mobilization 
protein


2.5 


1004 


AF093268


Rattus norvegicus 
homer-1c mRNA,


7e-04 


3201550 


(Y17116) fibrinogen-binding
protein

yj I Wiv ill 


2.4 


1005 


AF074386 


Sambucus nigra 
hevein-like protein
mRNA, complete cds 


7e-04 


1174264


(U45966) polyprotein [Hepatitis 
G virus] 


0.73


1006 


AF027173 


Arabidopsis thaliana 
cellulose synthase
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-04 


135308 


TRANSCRIPTION FACTOR 
JUN-D 


0.065 


1007 


X98745 


H.sapiens EWS gene
intron 6, 
polymorphism 


7e-04 


728836 


!!!! ALU SUBFAMILY SP
WARNING ENTRY 


0.001 


1008 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme


7e-04 


1633564 


(U47924) C8 [Homo sapiens] 


9e-09 


1009 


AT \J 1 tj O U 


Sambucus nigra 
hevein-like protein 


6e-04 


284171 


Ig epsilon chain C region form 3
- human


1.3 


1010 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


6e-04 


3845262 


(AE001414) BRAHMA 
ortholog (DNA helicase 
superfamily II) 


0.25 


1011 


AL034404 


Human DNA
sequence from clone
417C12 on
chromosome Xp22.11-
22.2, complete
sequence [Homo 
sapiens] 


3e-04 


<NONE> 


<NONE> 


<NONE> 


1012 


M99701 


Homo sapiens (pp21) 
mRNA, complete cds. 


3e-04 


<NONE> 


<NONE> 


<NONE> 


1013


U00227


Ovis aries Merino 
breed DR beta-chain 
antigen binding 
domain, MHC class II
DRB (Ovar-DRB24) 
gene, partial cds. 


3e-04 


<NONE> 


<NONE> 


<NONE> 


1014 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


3e-04 


<NONE> 


<NONE> 


<NONE> 


1015 


U95102 


Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA, complete cds 


3e-04 


999418 


(L19655) ORF [Tomato
ringspot virus] 


8.3 


1016 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


3e-04 


2367460 


(AF011415) putative
pheromone receptor [Mus 
musculus] 


7.0 


1017


AJ010737 


Mus musculus DNA 
for microsatellite 3kb 
upstream Ibp gene 


3e-04 


4106549 


(AF104411) neuronal-specific 
septin 3 [Mus musculus] 


5.5 


1018 


AF053137 


Homo sapiens histone 
deacetylase 3 gene, 
exons 4, 5, 6, 7, 8, 9, 
and 10 


3e-04 


416702 


NADH-DEPENDENT FLAVIN 
OXIDOREDUCTASE acid- 
inducible - Eubacterium sp 
>gi|1381570 (U57489) 
NADH.flavin oxidoreductase 
[Eubacterium sp. VPI 12708] 


5.3 


1019 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


3e-04 


1785789 


(Y08502) orf111d [Arabidopsis
thaliana] 


5.1 


1020 


AC004173 


Homo sapiens clone
UWGC:y23x011
from 6p21, complete
sequence [Homo
sapiens]


3e-04 


558521 


(D28917) polyprotein [Hepatitis 
C virus] 


1.1 


1021 


X57025 


Human IGF-I mRNA 
for insulin-like 
growth factor I 


3e-04 


4206707


(AF118122) putative outer
membrane protein OmpU


0.65 


1022 


X77090


H.sapiens IL-1Ra
gene.


3e-04 


1065941


(U40799) F42C5.7 gene product
[Caenorhabditis elegans]


0.12


1023


M34651


Pseudorabies virus
with upstream and
downstream
sequences.


3e-04 


2746853 


(AF040650) contains similarity 
to sodium-potassium-chloride 
cotransport proteins 


7e-05 


1024 


Z36011 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR142w


3e-04 


2500537 


PROBABLE ATP- 
DEPENDENT RNA 
HELICASE HAS1
>gi|626265|pir||S47451 
hypothetical protein YMR290c 
RNA helicase [Saccharomyces 
cerevisiae] 


4e-08 


1025 


AF020286 


Dictyostelium 
discoideum 2034 
gene, partial cds 


3e-04 


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans] 


6e-14 


1026 


L26049 


Chlamydomonas 
reinhardtii dynein 
heavy chain alpha 
(ODA11) gene, exons
2-15, and partial cds. 


3e-04 


3876775 


(Z81077) predicted using 
Genefinder; Similarity to Yeast 
protein 8248 (TR:G587531) 


9e-15 


1027 


AF020286 


Dictyostelium 
discoideum 2034 
gene, partial cds 


3e-04 


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans] 


1e-17


1028 


X79811 


S.cerevisiae ACT3 
gene 


3e-04 


3876090 


(Zt>yt»J^j Similarity to Yeast 
uridine kinase 

(SW:URK1_YEAST); cDNA 
EST EMBL:Z14695 comes 
from this gene; cDNA EST 
CEMSE17F comes from this 
gene; cDNA EST 
EMBL:D67355 comes from this 
gene; cDNA EST yk209hl.5 
comes from this ge... 


7e-31


1029 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1030 


M22970 


Human pancreatic 
phospholipase A-2 
(PLA-2) gene, exons 
1 to 3. 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1031


Z68686


Human DNA
sequence from
cosmid N2E9 on
chromosome 22.
Contains EST,
[Homo sapiens]


2e-04 


<NONE> 


<NONE> 


<NONE> 


1032 


X95154


H.sapiens brca2 gene 
exon 4 > ::

emb|A62779|A62779 
Sequence 20 from 
Patent WO9719110 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1033 


AJ005813 


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


2e-04 


<NONE> 


<NONE> 


<NONE> 


1034 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1035 


AE001415 


Plasmodium 
falciparum 
chromosome 2, 
section 52 of 73 of 
the complete 
sequence 


2e-04


<NONE> 


<NONE> 


<NONE> 


1036 


AF090115 


Lycopersicon 
esculentum cytosolic
class II small heat
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


2e-04


<NONE> 


<NONE> 


<NONE> 


1037 


AC000958 


Homo sapiens 
(subclone 6_d9 from 
P1 H21) DNA
sequence 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1038 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


2e-04 


2501523 


CD59 GLYCOPROTEIN 
PRECURSOR 


7.1 


1039 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04 


2765360 


(Y13925) cathepsin L2 [Penaeus
vannamei]


6.5


1040


AF027174


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds


2e-04


133636


RNA POLYMERASE
>gi|67126|pir||RRXPLC RNA-
directed RNA polymerase (EC 
2.7.7.48) - lymphocytic 
choriomeningitis virus (strain 
Armstrong 53b) >gi|331369 


5.2 


1041 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


2e-04 


3822155 


(AF074613) type II secretion 
protein [Escherichia coli 
O157:H7]


4.0 


1042 


U76524


Sambucus nigra 
ribosome inactivating 
protein precursor 

mRNA, complete cds


2e-04 


1718125 


REGULATORY PROTEIN E2 

>oi 11 020299 tvne 161 


0.38 


1043 




Sus scrofa mRNA for 
glucose transporte 

UI UlCll I 








9e- 1 "S 


1044 


AF008216 


Homo sapiens 
candidate tumor 
suppressor pp32rl 


1e-04


<NONE> 


<NONE> 


<NONE> 


1045 


X98890 


S. tuberosum mRNA 
for inorganic 
phosphate
transporter, StPT1


1e-04


624126 


(U42580) a65L [Paramecium 
bursaria Chlorella virus 1] 


7.9


1046 


L14930 


Glycine max (Rab7p) 

mRNA, complete cds.


9e-05 


<NONE>


<NONE> 


<NONE> 


1047 


AJ009970 


Mus musculus 
thromboxane A2 
receptor gene, exon 3, 
partial 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1048 


Y11896 


M.musculus mRNA 
for Brx gene, partial 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1049 


L10832 


Polistes annularis 
(clone pan48AAT) 
tandem repeat region. 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1050 


AF055011 


Homo sapiens clone 
24587 mRNA
sequence 


9e-05 


3880586 


(Z79758) cDNA EST
EMBL:D28009 comes from this 
gene; cDNA EST 
EMBL:D28008 comes from this 
gene; cDNA EST 
EMBL:D32478 comes from this 
gene; cDNA EST 
EMBL:D34508 comes from this 
gene; cDNA EST 
EMBL:D37581 comes from this 
gene; ... 


7.6 


1051 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


9e-05 


3024292 


RHODOPSIN >gi|2290717 
(AF000947) rhodopsin [Sepia 
officinalis] 


6.7 


1052 


Z58294 


H.sapiens CpG DNA, 
clone 34d6, forward 
read cpg34d6.ft1a.


9e-05 


3885496 


(AF064825) heparin/heparan
sulfate N-acetylglucosaminyl N-
deacetylase/N-sulfotransferase
[Bos taurus] 


0.65 


1053 


D87451 


Human mRNA for 
KIAA0262 gene, 
complete cds 


9e-05


3874739


(Z66495) similar to claustrin 
like 


0.004 


1054 


L37092 


Mus musculus cyclin- 
dependent kinase 
homologue 


9e-05 


3080513 


(AL022598) hypothetical 
protein 


4e-09 


1055 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1056 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1057 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1058 


D10102


Homo sapiens DNA 
from cosmid 
clone: 844, GT repeat 
sequence 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1059 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6
mRNA, complete cds 


8e-05 


1176475 


HYPOTHETICAL 80.4 KD
PROTEIN IN SMC3-MRPL8
INTERGENIC REGION
>gi|1078237|pir||S56849
probable membrane protein 
YJL073w - yeast 
(Saccharomyces cerevisiae) 
>gi|895898 (X88851) 
hypothetical protein YJL073w 
[Saccharomyces cerevisiae]


6.0 


1060 


X71934 


H.sapiens XB gene
for tenascin-X, repeat 
XIII 


8e-05 


285207 


microtubule-associated protein, 
110K tau - rat >gi|207158
(M84156) big tau [Rattus
norvegicus] 


3.7 


1061 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


4049682 


(AF063866) ORF MSV092 
hypothetical protein 
[Melanoplus sanguinipes 
entomopoxvirus] 


2.1


1062 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


8e-05 


3861019 


(AJ235271) unknown 
[Rickettsia prowazekii] 


5e-14 


1063 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


7e-05 


<NONE> 


<NONE> 


<NONE> 


1064 


L04193 


Human lens 
membrane protein 
(mp19) gene, exon
11. 


7e-05


<NONE> 


<NONE> 


<NONE> 


1065 


X61609 


B.napus gene for 
LHC II Type III 
chlorophyll a/b 
binding protein 


7e-05 


2132314 


hypothetical protein YPR174c - 
yeast similarity to a nuclear 
lamin from C. elegans (PIR 
accession number S42257) 
[Saccharomyces cerevisiae]


8.9 


1066 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


7e-05 


2979422 


(AB006757) PCDH7 (BH- 
Pcdh)c [Homo sapiens] 


5.7 


1067 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-05 


2493696 


HYPOTHETICAL 21.5 KD 
PROTEIN (ORF 185) 
>gi|1480440 (U34204)
ORF185; hypothetical 21.4 kD
protein [Brassica oleracea]


5.2 


1068 


AF093268 


Rattus norvegicus 
homer- lc mRNA, 
complete cds 


7e-05 


2501029 


PROBABLE LEUCYL-TRNA 
SYNTHETASE, 
MITOCHONDRIAL 
PRECURSOR (LEUCINE 
TRNA LIGASE) (LEURS) 
KIAA0028 [Homo sapiens] 


1.4 


1069


Z68758


Human DNA
sequence from
cosmid cN85E10 on
chromosome 22q11.2-
qter


3e-05 


<NONE> 


<NONE> 


<NONE> 


1070 


X60653 


human Histone H3.3 
pseudogene (CIR- 
456) 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1071 


Z58294 


H.sapiens CpG DNA,
clone 34d6, forward
read cpg34d6.ft1a.


3e-05 


1706241


GUANYLYL CYCLASE GC-E 
[Mus musculus] 


9.6 


1072 


AF043251 


Homo sapiens 
mitochondrial outer 
membrane protein 
(Tom40) gene, 

encoding 
mitochondrial 
protein, exons 1 
through 6 


3e-05 


113980 


AMINE OXIDASE [FLAVIN-
CONTAINING] B oxidase
(flavin-containing) (EC 1.4.3.4) 
B - human B [human, platelet, 
Peptide Partial, 520 aa] [Homo 
sapiens] 


8.9 


1073 


M31104 


Chicken progesterone 

encoding forms A and 
B. exons 1 and 2. 


3e-05


1170841 


IG GAMMA LAMBDA 
CHAIN V-II REGION 


4.8 


1074 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


543684 


ribosomal protein S3 - 

r , Klnm\/Hr\mnric Vi 11m i/T\ 1 n 

\ . 1 1 1 aj. 1 1 1 yuoiiNjiiuo iiuimtuiu 

chloroplast (fragment) 


4.2 


1075 


L22206 


Human vasopressin 
receptor V2 gene,
complete cds. 


3e-05 


791207 


(U20615) Gnot1 homeodomain
protein [Gallus gallus] 


1.8 


1076 


AF093268 


Rattus norvegicus 
homer-1c mRNA,
complete cds 


3e-05 


3237340 


(AF033361) polyprotein 
[Hepatitis C virus] 


0.94 


1077 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05 


2879805 


(AL021813) hypothetical 
protein 


0.001 


1078 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05 


3877951 


(Z81555) predicted using
Genefinder


3e-07 



WO 01/02568 



PCT/US00/18374 







1079 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


2e-05 


<NONE> 


<NONE> 


<NONE> 


1080 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


3880197 


(Z81132) predicted using 
Genefinder 


2.4


1081 


AF087989 


Homo sapiens full 
length insert cDNA 
clone YX29D10 


2e-05 


113667 


!!!! ALU CLASS B WARNING 
ENTRY !!!! 


1.8


1082 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


474896 


(L31967) mating type protein 
[Coprinus cinereus] 


1.4 


1083 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


2266988 


(Y13274) M33 polycomb-like
protein [Mus musculus] 


0.62 


1084 


U67415 


Equus caballus UCD- 
E-CA-467 
dinucleotide repeat 
region, complete 
sequence 


1e-05


<NONE> 


<NONE> 


<NONE> 


1085 


X67277 


H.sapiens BGP gene 
for biliary 
glycoprotein, 
promoter region and 
exon 1 


1e-05


<NONE> 


<NONE> 


<NONE> 


1086 


X85117 


H.sapiens epb72 gene 
exons 2,3,4,5,6,7 


1e-05


<NONE> 


<NONE> 


<NONE> 


1087


U88328 


Mus musculus 
suppressor of 
cytokine signalling-3


1e-05


443877 


(Z29457) core region; 
pid:g443877 [Hepatitis C
virus]


3.9 


1088 


Y12853 


Homo sapiens P2X7 
gene, exon 4-8


1e-05


3878726 


(Z66498) similar to cuticle 
collagen; cDNA EST 
EMBL:D75584 comes from this 
gene


0.36 


1089 


AE001140 


Borrelia burgdorferi 
(section 26 of 70) of 
the complete genome 


1e-05


3860719 


(AJ235270) GLUTAMYL-
tRNA AMIDOTRANSFERASE
SUBUNIT A (gatA) [Rickettsia 
prowazekii] 


4e-15 


1090 




Homo sapiens gamma
adaptin gene, exon 2
and flanking intronic
sequences


9e-06


<NONE>


<NONE>


<NONE>


1091 


AB000565 


Homo sapiens DNA 
for repeat sequence 
Alu . 


9e-06 


72879 


translation initiation factor EF-2 - 
Escherichia coli 


5.1 


1092 


Z78985 


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA20B4 


9e-06 


159975


(M65164) 51C surface protein 
[Paramecium tetraurelia] 


4.8 


1093 


Z21677 


Thermotoga maritima
DNA for spc operon 


9e-06 


585879 


50S RIBOSOMAL PROTEIN 
(Z21677) ribosomal protein L2 


7e-14 


1094 


AF031494 


Dhc7 (Threads) 
mRNA, complete cds 


9e-06 


729377 


DYNEIN BETA CHAIN, 

(Anthocidaris crassispina) chain 
[Anthocidaris crassispina] 


4e-18 


1095 




Homo sapiens
placental protein
17a1 (PP17) mRNA,
complete cds










1096 


AC001460 


Homo sapiens 
(subclone 2_f4 from 
BAC HI 07) DNA 
sequence 


4e-06 


2648304 


(AE000952) ISA1214-6.
putative transposase 


6.2 


1097 


X85030 


H.sapiens mRNA for

skeletal muscle- 
specific calpain 


4e-06 


4239857 


(AB016726) calpain 
[Schistosoma japonicum] 


0.006 


1098 


M75162 


Human polymorphic 
arylamine N-
acetyltransferase 


3e-06 


<NONE> 


<NONE> 


<NONE> 


1099 


AB009999 


Rattus norvegicus 
mRNA for CDP- 
diacylglycerol 
synthase, complete 
cds 


3e-06 


3879045 


(Z70309) R102.6
[Caenorhabditis elegans] 


7.3 


1100 


Z78985


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA20B4 


3e-06


266529 


MERCURIC REDUCTASE 
(HG(II) REDUCTASE) 
>gi|418744|pir||S3016S 
mercury (II) reductase 


6.5 


1101 


AB012190


Homo sapiens mRNA 
for Nedd8-activating
enzyme hUba3,
complete cds 


3e-06


3877938 


(Z79697) F58H10.1 
[Caenorhabditis elegans] 


6.3 


1102


AF041056


Homo sapiens
WSCR4 gene, exons
3 and 4


3e-06 


1568583 


(Z80775) hypothetical protein 
Rv0044c 


1.9 


1103 


X00777 


Mouse E(d) beta gene 
5' flanking region and 
exon 1 


3e-06 


1680722 


(U72497) fatty acid amide 
hydrolase [Rattus norvegicus] 


0.008


1104 


D21205 


Human mRNA for 
estrogen responsive 
finger protein, 
complete cds 


3e-06 


563127 


(U09825) acid finger protein 
[Homo sapiens]


1e-05


1105 


Z47046 


Human cosmid 
OLL2C9 from Xq28 


1e-06


<NONE>


<NONE> 


<NONE> 


1106 


L26261 


Human MHC class III 
HLA-RP1 gene. 


1e-06


<NONE> 


<NONE> 


<NONE> 


1107 


M13402


Rat 5S RNA gene, 
clone 5S-2. 


1e-06


<NONE> 


<NONE> 


<NONE> 


1108 


X68793 


H.sapiens gene for 
antithrombin III 


1e-06


<NONE> 


<NONE> 


<NONE> 


1109 


AF003540 


Homo sapiens 
Krueppel family zinc 
finger protein 


1e-06


2507553 


ZINC FINGER PROTEIN 33A 
(ZINC FINGER PROTEIN 
KOX31)(KIAA0065) 
(HA0946) Kruppel-related. 
[Homo sapiens] 


0.095


1110 


L42096 


Homo sapiens 
(subclone 10_d2 from 
P1 H21) DNA
sequence. 


1e-06


1330401 


(U58762) T27F7.1 gene product 
[Caenorhabditis elegans] 


0.015 


1111 


Z69925 


Human DNA
sequence from
cosmid cN116A5,
between markers
D22S280 and
D22S86 on
chromosome 22q12
contains EST 


9e-07 


<NONE> 


<NONE> 


<NONE> 


1112 


D90217 


S. cerevisiae gene for 
YmL33, 
mitochondrial 
ribosomal proteins of 
large subunit 


9e-07 


3879097 


(Z81109) predicted using
Genefinder; similar to 
sodium/phosphate transporter; 
cDNA EST yk326f6.3 comes 
from this gene; cDNA EST 
yk326f6.5 comes from this gene 
[Caenorhabditis elegans] 


7.1 


1113


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


9e-07


1330345


{U^H/bb) coded for by C.
elegans cDNA ykJ4bl.D; coded
for by C. elegans cDNA 
ykl3hl0.5; coded for by C. 
elegans cDNA yk46e8.5; coded 
for by C. elegans cDNA 
yk46d5.5; coded for by C. 
elegans cDNA yk43c2.5; coded 
for by C. elegans cDNA 
yk46e8.... 


2e-29 


1114 


Ar{Joo562 


Homo sapiens full 
length insert cDNA 
clone ZE16C03 


4e-07 


1072210 


(U40945) coded for by C. 
elegans cDNA yk74b9.3; coded 
for by C. elegans cDNA 
yk74b9.5; similar to repeat of 
calcium channel alpha subunits; 
similar to tetracycline resistance 
protein; similar to hypothetical 
protein in HSP30-PMP1 region 
(SP... 


3.9 


1115 


L39062 


Homo sapiens 
interleukin 9 receptor 
IL9R pseudogene, 
exons 1-9 


4e-07 


3879983 


(Z.40/y^j similar to 
transforming protein etc2; 
cDNA EST EMBL:D34137
comes from this gene; cDNA 
EST EMBL:D37172 comes 
from this gene; cDNA EST 
EMBL:D76266 comes from this 
gene; cDNA EST 
EMBL:D70493 comes from this 
gene; cDNA ... 


3.3 


1116 


Z69364 


Human DNA
sequence from
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST
and cDNA. > ::
emb|Z69365|HSL96F
8A Human DNA
sequence from
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST
and cDNA.


4e-07 


3493176 


(AF022889) latent TGF beta
binding protein [Mus musculus] 


3.0





1117


D79986


Human mRNA for
KIAA0164 gene,
complete cds


4e-07 


4038031


(AC005936) hypothetical 
protein [Arabidopsis thaliana] 


0.30 


1118


D43950 


Human mRNA for 
KIAA0098 gene, 
partial cds 


3e-07 


<NONE> 


<NONE> 


<NONE> 




1119


AF037168


Arabidopsis thaliana 
DnaJ homologue 
(AtJ6) mRNA,
complete cds 


3e-07 


3881075 


^axajj^co / ) predicted using 
Genefinder; similar to DnaJ 
domain ; Thioredoxin; cDNA 
EST yk433f3.5 comes from this 
gene; cDNA EST 
EMBL:D32359 comes from this 
gene; cDNA EST 
EMBL:D34721 comes from this 
gene; cDNA EST yk433f3.3 c... 


3e-09 


1120 


X69838 


H.sapiens mRNA for 
G9a 


3e-07 


3873414 


(U00043) similar to D. 
melanogaster trithorax protein 


3e-29 


1121 


AB011124 


Homo sapiens mRNA 
for KIAA0552 
protein, complete cds 


2e-07 


2618749 


(U90880) hypothetical protein 
2; predicted using XGrail 


2.0 


1122 


K03012 


Human cellular fms 
proto-oncogene, 
partial cds. 


1e-07


<NONE> 


<NONE> 


<NONE> 


1123 


AB016195 


Homo sapiens DNA, 
microsatellite and Alu 
repeat region 


1e-07


728837 


!!!! ALU SUBFAMILY SQ
WARNING ENTRY 


0.095 


1124 


Y16795


Homo sapiens 
psihHaA pseudogene


4e-08 


<NONE> 


<NONE> 


<NONE> 


1125 


AB012624 


Homo sapiens FLI1
gene for ERGB
transcription factor,
intron 4 and partial 
cds 


4e-08 


728836 


!!!! ALU SUBFAMILY SP 
WARNING ENTRY 


3.6 


1126 


AJ131341


Homo sapiens ogg1
gene, exons 1-7 


4e-08 


113668


!!!! ALU CLASS C WARNING 
ENTRY !!!! 


3e-05 


1127 


L81902 


Homo sapiens 
(subclone 1_c10 from
P1 H69) DNA
sequence 


3e-08


4225950 


(AJ132701) centaurin gamma 1B


1.8 


1128 


Y17968 


Gallus gallus mRNA 
for high mobility 
group 1 protein 


3e-08


3041855 


(AC004537) similar to tumor 
suppressor p33ING1; similar to
AF044076 (PID:g2829208)
[Homo sapiens]


3e-31 


1129 


Y13901 


Homo sapiens FGFR- 
4 gene 


1e-08


<NONE> 


<NONE> 


<NONE> 


1130 


L22024 


Mesocricetus auratus 
serum amyloid P 
component gene, 
complete cds. 


1e-08


<NONE>


<NONE> 


<NONE> 


1131 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


1e-08


<NONE>


<NONE> 


<NONE> 


1132 


X14034


Human mRNA for 
phospholipase C > :: 
gb|M37238|HUMPL 
C Human 
phospholipase C 
mRNA, complete cds. 


1e-08


<NONE>


<NONE> 


<NONE> 


1133 


Z59381 


H.sapiens CpG DNA, 
clone 152b10,
forward read
cpg152b10.ft1a.


1e-08


<NONE> 


<NONE> 


<NONE> 


1134 


L81839 


Homo sapiens 
(subclone 2_h3 from 
P1 H43) DNA
sequence


1e-08


<NONE> 


<NONE> 


<NONE> 


1135 


X14448 


Human GLA gene for 
alpha-D-galactosidase 
A (EC 3.2.1.22) 


1e-08


3334427 


HYPOTHETICAL PROTEIN

MJ1207 Methanococcus 
jannaschii >gi| 1591837 
(U67562) protease synthase and 
sporulation negative regulator 
Pail, putative [Methanococcus 
jannaschii] 


9.1 


1136 


AL023774


Human DNA
sequence from clone
799F15 on
chromosome Xq25,
complete sequence
[Homo sapiens]


1e-08


1354935


(U58330) probable copper-
transporting atpase


1.2


1137 


X64639


H.sapiens DNA
repetitive
subtelomeric-like
sequence (522 bp)


1e-08


77356


hypothetical 70K protein -
eggplant mosaic virus


0.095


1138 


U97058


Human HuD gene,
5'UTR


5e-09


3387886


(AF070530) unknown [Homo
sapiens]


9.5





1139


Z82181


Human DNA
sequence from
cosmid E86D10 on
chromosome 22.
contains ESTs,
exontrap, complete
sequence


5e-09 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


8.4 


1140 


AJ006587 


Mus musculus mRNA 
for translation 
initiation factor eEF2 
gamma X 


5e-09 


1872200 


(U22376) alternatively spliced 
product using exon 13A 


0.64 


1141 


Y11108 


H.sapiens WNT8B 
gene 


4e-09 


2854198 


(AF045646) contains similarity 
to collagens 


4.0 


1142 


AE001223 


Treponema pallidum 
section 39 of 87 of 
the complete genome 


4e-09 


3334189 


CELL DIVISION PROTEIN 
FTSY HOMOLOG 


1.5 


1143 


Z47046 


Human cosmid 
QLL2C9 from Xq28 


4e-09 


104045 


fibroblast growth factor receptor 
Al precursor - African clawed 
frog >gi|214894 (M55163)
fibroblast growth factor receptor 
[Xenopus laevis] 


1.3 


1144 


AG000746 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
T171Bm40 


4e-09 


113666 


!!!! ALU CLASS A WARNING
ENTRY !!!! 


0.33 


1145 


M74002 


Human arginine-rich 
nuclear protein 
mRNA, complete cds. 


4e-09 


3875371 


^/-.ju^'-fj/'iruiujiiii d valine Lrnu""" 
arginine rich domain, possesses 
weak similarity with the RNA 
binding domains from RNA 
splicing factor U2AF 65 KD 
subunit; cDNA EST 
EMBL:D64658 comes from this 
gene; cDNA EST 
EMBL:D66829 comes f... 
>gi|3878699|gnl|PID|e 135 1700 
possesses weak similarity with 
the RNA binding domains from 
RNA splicing factor U2AF 65 
KD subunit; cDNA EST 
EMBL:D64658 comes from this 
gene; cDNA EST 
EMBL:D66829 comes f... 


3e-06 


1146 


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


2e-09


2494337


ENDO-1,4-BETA-XYLANASE
PRECURSOR sp.]


4.9


1147


U20554


Drosophila
melanogaster UDP-
glucose:glycoprotein
glucosyltransferase
mRNA, complete cds.


2e-09


2499087


UDP-
GLUCOSE:GLYCOPROTEIN
GLUCOSYLTRANSFERASE
PRECURSOR (DUGT) 
glucosyltransferase - fruit fly 
(Drosophila sp.) 
glucosyltransferase precursor 
[Drosophila melanogaster] 


4e-24 


1148 


Z56162 


H. sapiens CpG DNA, 
clone 91c9, forward 
read cpg91c9.ft1a.


1e-09


<NONE> 


<NONE> 


<NONE> 


1149 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-09


1002424


(U25739) YSPL-1 form 1 [Mus 
musculus] 


8.9 


1150 


M85276 


Homo sapiens NKG5 
gene, complete cds. 


1e-09


2315436 


(AF016447) No definition line 
found [Caenorhabditis elegans]


8.3 


1151 


M94065 


Human 

dihydroorotate 
dehydrogenase 
mRNA, 3' end. 


1e-09


3892656 


(AB014464) MGC-24v [Mus 
musculus] 


6.2 


1152 


AJ131895 


Homo sapiens 
genomic CAG repeat 
element, clone 
60o2(250) 


5e-10 


<NONE> 


<NONE> 


<NONE> 


1153 


Z82181 


Human DNA
sequence from 
cosmid E86D10 on 
chromosome 22. 
contains ESTs, 
exontrap, complete 
sequence 


5e-10 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


7.9 


1154


AJ224442 


Homo sapiens mRNA 
for putative 
methyltransferase


5e-10 


113667 


!!!! ALU CLASS B WARNING
ENTRY !!!! 


0.15 


1155 


AJ010230


Homo sapiens RET 
finger protein-like 1
antisense transcript, 
partial 


5e-10 


728834 


!!!! ALU SUBFAMILY SB2
WARNING ENTRY 


0.006 


1156 


AF111116


Homo sapiens 
silencer of death 
domains (SODD) 
mRNA, complete cds 


5e-10 


4160014 


(AF111116) silencer of death
domains [Homo sapiens] 


2e-08 


1157 


Z97017 


Homo sapiens mRNA
for hypothetical
protein


4e-10 


<NONE> 


<NONE> 


<NONE> 


1158


AF001298


Homo sapiens type II
integral membrane
protein


4e-10 


<NONE> 


<NONE> 


<NONE> 


1159


Y11395 


H.sapiens mRNA for 
p40 


2e-10 


1000340 


(U34384) CheW [Borrelia 
burgdorferi]


2.4 


1160


U41096 


Human non-coding 
sequence upstream 
from DOC-2 gene on
chromosome 5 


2e-10 


728837 


!!!! ALU SUBFAMILY SQ
WARNING ENTRY 


0.28 


1161 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


6e-11


<NONE> 


<NONE> 


<NONE> 


1162


Z36111 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR242w 


6e-11


2213560 


(Z97052) hypothetical protein 


3e-27 


1163 


D89174 


Schizosaccharomyces 
pombe mRNA, partial
cds, clone: SY 1004 


6e-11


3879758 


{£WZZU) Similarity to yeast 
protein TREMBL ID E246895); 
cDNA EST EMBL:T00018 
comes from this gene; cDNA 
EST EMBL:C13908 comes
from this gene; cDNA EST
EMBL:C11656 comes from this
gene; cDNA EST yk234a5.3 
comes from this ge... 


4e-30 


1164


Z95437


Human DNA
sequence from 
cosmid Al on 
chromosome 6 
contains ESTs. 
HERV like retroviral 
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1165 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


5e-11


3886065 


(AF106581) contains similarity 
to C4-type zinc fingers 


4.9 


1166 


X56997


Human UbA52 gene 
coding for ubiquitin-
52 amino acid fusion
protein


2e-11


<NONE> 


<NONE> 


<NONE> 


1167 


AF086253


Homo sapiens full 
length insert cDNA
clone ZD40G12


2e-11


21347S0 


apoptosis inhibitor IAP homolog
human 


3.8 





1168 


AB018314 


Homo sapiens mRNA 
for KIAA0771 
protein, partial cds 


2e-11


3024343 


P53-BINDING PROTEIN 
53BP2 Bbp/53BP2 [Homo 
sapiens] 


2e-11


1169 


Z74972


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR064c 


2e-11


3041855 


(AC004537) similar to tumor 
suppressor p33ING1; similar to
AF044076 (PID:g2829208) 
[Homo sapiens] 


2e-40 


1170 




Human DNA
sequence from
cosmid E86D10 on
chromosome 22. 
contains ESTs, 
exontrap, complete 


7e-12 


^l^J Ul 1 ! C^-* 






1171 


X77738 


H.sapiens red cell
anion exchanger
(EPB3, AE1, Band 3)
gene, 3' region


7e-12 


2135416 


hypothetical protein - human 
>gi|288145 


0.012 


1172 


S61977 


medium-chain acyl- 
CoA dehydrogenase 
{exon 10, intron 10} 
[human, Genomic,
1407 nt]


6e-12 


113666


!!!! ALU CLASS A WARNING
ENTRY !!!! 


0.100 


1173 


X66285 


M.musculus DNA for 
HC I locus 


6e-12 


854065 


(X83413) U88 [Human 
herpesvirus 6] 


2e-06 


1174 


S78744 


protein S=activated 
protein C cofactor 
[rats, liver, mRNA, 
3315 nt]


6e-12


2338292 


(AF009243) proline-rich Gla 
protein 2 [Homo sapiens] 


3e-10


1175 


X58474 


Bovine OXT gene for
oxytocin, 5'
noncoding region 


2e-12 


1296429 


(L77967) small proline-rich 
protein with paired repeat 


4.1 


1176 


Z56314 


H.sapiens CpG DNA, 
clone 10h10, reverse
read cpg10h10.rt1a


2e-12 


2935221 


(AF030154) pVII [bovine 
adenovirus type 3] 


2.8 


1177 


Z56314 


H.sapiens CpG DNA, 
clone 10h10, reverse
read cpg10h10.rt1a


2e-12 


2708659 


(AF037440) putative 26 kDa 
protein [Edwardsiella ictaluri] 


2.8 


1178


Z19543 


M.musculus h2- 
calponin cDNA 


2e-12 


2497945 


BETA SCRUIN >gi| 1015535 
(Z47541) beta scruin [Limulus 
polyphemus] 


2e-04





1179


S45332


erythropoietin
receptor [human,
placental, Genomic, 
8647 nt] 


7e-13 


728835 


!!!! ALU SUBFAMILY SC 
WARNING ENTRY 


0.074 


1180


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-13 


<NONE> 


<NONE> 


<NONE> 


1181 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-13 


<NONE> 


<NONE> 


<NONE> 


1182


Z59509 


H.sapiens CpG DNA, 
clone 15a1, reverse
read cpg15a1.rt1a


2e-13 


3150251 


(AL023634) hypothetical 
protein 


0.66 


1183


D10170 


Human CYP11B2 
gene for steroid 18- 
hydroxylase 


2e-13 


728837 


!!!! ALU SUBFAMILY SQ
WARNING ENTRY 


3e-05 


1184 


U65416 


Human MHC class I 
molecule (MICB) 
gene, complete cds 


2e-13 


126295 


LINE-1 REVERSE 
TRANSCRIPTASE
HOMOLOG 


6e-11


1185 


AJ006031


Mus musculus 
IHABP gene, 
promoter 


8e-14 


2132223 


hypothetical protein YPL186c -
yeast 


1.1 


1186 


U34976 


Human gamma- 
sarcoglycan mRNA, 
complete cds 


8e-14 


1054903 


(U34976) gamma-sarcoglycan 
[Homo sapiens] >gi|4239660
sapiens] 


0.034 


1187 


D30647


Rat mRNA for very- 
long-chain Acyl-CoA 
dehydrogenase, 
complete cds 


8e-14 


3183512 


ACYL-COA 

DEHYDROGENASE, VERY- 
LONG-CHAIN SPECIFIC 
(VLCAD) >gi|2388724 
(AF017176) very-long-chain
acyl-CoA dehydrogenase [Mus 
musculus] 


8e-23 


1188 


Z63247 


H.sapiens CpG DNA, 
clone 7g4, forward 
read cpg7g4.fla . 


6e-14 


86285 


histone H1.01 - chicken


6.8 


1189 


U27196 


Gallus gallus zinc 
finger protein (Fzf-1) 
mRNA, complete cds. 


3e-14 


2134436 


zinc finger protein - chicken 
(fragment)


4e-10 


1190 


M26219 


African green 
monkey origin of 
replication 


2e-14 


<NONE> 


<NONE> 


<NONE> 





1191


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-14


4235641 


(AF119040) NL0D
[Lycopersicon esculentum] 


0.65 


1192 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor
mRNA, complete cds 


2e-14


3043728 


(AB011174) KIAA0602 protein
[Homo sapiens] 


0.28 


1193 


AJ005866


Homo sapiens mRNA 
for putative Sqv-7- 
like protein, partial 


2e-14


4008517 


(AJ005866) Sqv-7-like protein
[Homo sapiens] 


0.004 


1194 


U32709 


Haemophilus
influenzae Rd section
24 of 163 of the
complete genome 


2e-14 


3861056 


(AJ235272) 

POLYRIBONUCLEOTIDE
NUCLEOTIDYLTRANSFERASE
(pnp) [Rickettsia
prowazekii] 


6e-28 


1195 


AF073485 


Homo sapiens MHC
class I-related protein
MR1 precursor
(MR1) gene, partial
cds 


8e-15 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


1.0 


1196 


AF052135 


23625 mRNA 
sequence 


8e-15 


4098124 


(U73522) AMSH [Homo 
sapiens] 


8e-14 


1197 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-15 


<NONE> 


<NONE> 


<NONE> 


1198 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-15 


113671 


- • . - r\_L, KJ V_JL^i-\00 i WrVIVi\ UNU 

ENTRY !!!! 


1.7 


1199 


Z75104 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR196c


3e-15 


3878570 


(Z46381) similar to lipoic acid 
synthase; cDNA EST yk283b6.3 
comes from this gene; cDNA 
EST yk283b6.5 comes from this 
gene; cDNA EST yk472f5.3 
comes from this gene; cDNA 
EST yk472f5.5 comes from this 
gene; cDNA EST yk476e7.3... 


1e-15





1200


X7005^


S.cerevisiae sof1
gene


3e-15


1125754


(U42833) coded for by C.
elegans cDNA cm16f6; coded
for by C. elegans cDNA
CEESU63F; similar to S.
cerevisiae SOF1 protein
(SP:P33750) [Caenorhabditis




1201 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-15 


<NONE>


<NONE> 


<NONE> 


1202 


M92295 


Gorilla gorilla gamma-
1 and gamma-2
globin genes,
complete cds. 


1e-15


284078 


U J JJWtl IClIv^tll JJ1L/LC1II 11 Ul nun 

>gi| 182220 


7.4 


1203 


L34587 


Homo sapiens RNA 
polymerase II 
elongation factor SIII,
p15 subunit mRNA,
complete cds. > ::
gb|AR022286|AR022
286 Sequence 7 from
patent US 5792634 


9e-16 


<NONE> 


<NONE> 


<NONE> 


1204 


D83649 


Xenopus laevis 
mRNA for xSox7 
protein, complete cds 


8e-16 


2447043 


(D83649) xSox7 protein 
[Xenopus laevis]


4e-06 


1205 


AC005190 


Homo sapiens PAC 
clone DJ1152D16 
from Xq23; complete 
sequence [Homo 
sapiens] 


3e-16 


<NONE> 


<NONE> 


<NONE> 


1206 


J03626 


Human UMP 
synthase mRNA, 
complete cds. 


3e-16 


113667 


! ! ! ! ALU CLASS B WARNING 
ENTRY !!!! 


0.65 


1207 


J00083 


Human Alu family 
interspersed repeat; 
clone BLUR11. 


3e-16 


728836 


!!!! ALU SUBFAMILY SP 
WARNING ENTRY 


4e-06 


1208 


U70674 


Mus musculus m- 
Numb (m-nb) mRNA,
complete cds 


1e-16


<NONE> 


<NONE> 


<NONE> 


1209 


U66619 


Human SWI/SNF 
complex 60 KDa 
subunit (BAF60c) 
mRNA, complete cds 


1e-16


1549247 


(U66619) SWI/SNF complex 60 
KDa subunit [Homo sapiens] 


0.003 


1210 


U75467 


Drosophila 
melanogaster Rga and 
Atu genes, complete 
cds 


1e-16


1658503 


(U75467) Atu [Drosophila 
melanogaster] 


5e-32 


1211 


M72709 


Human alternative 
splicing factor 
mRNA, complete cds. 


3e-17 


<NONE> 


<NONE> 


<NONE> 


1212 


U26556 


Human ferritin H 

(FTHL13) 

pseudogene. 


3e-17


<NONE> 


<NONE> 


<NONE> 


1213 


D32064 


Human gene for 2- 
oxoglutarate 
dehydrogenase, 
complete cds 


3e-17 


2088843 


(AF003386) F59E12.9 gene 
product [Caenorhabditis 
elegans]


0.12 


1214 


M76364 


Human (Papua New 
Guinean) 

Mitochondrial DNA 
control region, 
sequence 131. 


3e-17 


114009


APAG PROTEIN 
>gi|72927|pir||BVECAG apaG 
protein - Escherichia coli 
>gi|40918(X04711)URF 
hypothetical protein 
[Escherichia coli] 


0.006 


1215 


AF017466


Homo sapiens 
genomic sequence 
from subtelomeric 
region of 
chromosome 4q 


1e-17


3947985 


(U78948) MADS-box protein 2 
[Malus domestica]


4.1 


1216 


AF004876 


Homo sapiens 
54TMp (54tm) 
mRNA, complete cds 


1e-17


4101574 


(AF004876) 54TMp [Homo 
sapiens] 


0.006 


1217 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-18 


<NONE> 


<NONE> 


<NONE> 


1218 


AF086758 


Rattus norvegicus Na-
K-2Cl cotransporter


4e-18 


3892703 


(AL033545) putative glycine- 
rich protein [Arabidopsis 
thaliana] 


0.30 


1219 


AF020089 


Homo sapiens 
PEN11B mRNA,
complete cds 


4e-18 


2642493 


(AF023910) DNA
topoisomerase I [Physarum
polycephalum]


0.0S3 


1220 


X82333 


H. sapiens IRLB gene 
(exonl-3) 


4e-18 


106837 


irlB protein - human (fragment) 
>gi|33969 


2e-11





1221


AB002383


Human mRNA for
KIAA0385 gene,
complete cds 


4e-18 


3228540 


(AF060181) zinc finger protein 
[Homo sapiens] 


6e-25 




X98485


P.vivax PV14 gene 


1e-18


<NONE> 


<NONE> 


<NONE> 


1223 


Z79057 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment,
SC6pA21E8 


1e-18


2981631 


(AB012223) ORF2 [Canis 
familiaris] 


0.001 


1224 


L01457 


Homo sapiens (clone 
JH4Bl)PM-scl 
autoantigen mRNA, 
complete cds. 


1e-18


346287 


nucleolar 100K polymyositis- 
scleroderma protein - human 
>gi|35555 (X66113) PM/Scl 
100kD nucleolar protein [Homo
sapiens] 


0.001 


1225 


L02897 


Dog nonerythroid 
beta-spectrin mRNA, 
3' end. 


4e-19


3493358 


(AB017037) nonstructural
protein precursor [Himetobi P 
virus] 


0.12 


1226 


AB012162 


Homo sapiens mRNA 
for APCL protein, 
complete cds 


4e-19 


3894265 


(AB012162) APCL protein 
[Homo sapiens] 


0.002 


1227 


AB011093 


Homo sapiens mRNA 
for KIAA0521 
protein, partial cds 


4e-19 


3043566 


(AB011093) KIAA0521 protein
[Homo sapiens] 


9e-09 


1228


X78454 


X.laevis AB21 
mRNA for RPD3 
homologue


4e-19 


3023945 


HISTONE DEACETYLASE 
(HD) thaliana] 


5e-34 


1229 


U88895 


Human endogenous 
retrovirus H D1
leader
region/integrase-
derived ORF1,
ORF2, and putative 
envelope protein 
mRNA, complete cds 


2e-19 


59977 


(Z14310) tripartite fusion
transcript PLA2L [Human 
endogenous retrovirus] 


1e-04


1230 


U34377 


Human tyrosine 
kinase TXK (txk)
gene, exon 13. 


1e-19


728831 


!!!! ALU SUBFAMILY J
WARNING ENTRY 


3e-05 


1231 


X72966 


M.musculus rab3A 
gene


1e-19


2408076 


(Z99167) putative peroxisomal 
organisation and biogenesis 
protein [Schizosaccharomyces
pombe] 


2e-09 


1232 


AB007953 


Homo sapiens 
mRNA, chromosome 
1 specific transcript 
KIAA0484


4e-20 


<NONE> 


<NONE> 


<NONE> 



WO 01/02568 



PCT/US00/18374 



SEQ 
i ID 


Nearest 
ACCESSION 


Neighbor (BlastN vs. ( 
f DESCRIPTION 


jenbank) 
P VALUE 


Nearest Neishl 
ACCESSION 


oor (BlastX vs. Non-Redundant Pi 
DESCRIPTION 


P VALUE 


1233 


D14034


Human gene for Zn-
alpha2-glycoprotein,
complete cds 


2e-20 


3928756 


(AB00153:)) similar to 
C.elegans hypothetical protein 
CET01H8.1, CEC05C12.3, CEF5
4D1.5, similar to trp and trp-like
proteins [Homo sapiens] 


1e-07


1234 


X82126 


ri. sapiens r\\j)s.- z. 
gene, exon 2 


2e-20 


2137269 


DNA-binding protein - mouse 
>gi|437444 


1e-19


1235 


AF093684 


Luciferase reporter 
vector pXP2 *SA V 
complete sequence 


5e-21 


2773363 


(AF041382) microtubule 
binding protein D-CLIP-190


5.5 


1236 




Human IMP 
dehydrogenase type 1 
mRNA, complete cds.


5e-21 


124417 


INOSINE-5'-
MONOPHOSPHATE
DEHYDROGENASE 1 (IMP 
DEHYDROGENASE 1) 
(IMPDH-I) (IMPD 1) - human


2e-04 


1237 


D86997 


Human (lambda) 
DNA for 

immunoglobulin light 
chain 


5e-21 


3878261 


(Z75712) Similarity to S. Pombe 
BEM1/BUD5 suppressor; 
cDNA EST EMBL Z14470 
comes from this gene; cDNA 
EST yk482d4.3 comes from this 
gene; cDNA EST yk482d4.5 
comes from this gene 
[Caenorhabditis elegans]


6e-46 


1238 


Z79865


H. sapiens
chromosome 22 CpG
island DNA genomic
Mse1 fragment, clone
502f3, forward read
502f3.f


2e-21 


2739037


(AF024614) ADAM 10
[Caenorhabditis elegans] Zinc-
binding metalloprotease domain;
cDNA EST CEMSA42F comes
from this gene; cDNA EST
yk218f3.3 comes from this gene;
cDNA EST yk443d9.3 comes
from this gene; cDNA EST
yk443d9.5 comes from this
gene; cDNA...


2.6 





= U'u-uvujujj simiuu lu laiuiiiiu 




1239 




Mus musculus 
Pontin52 mRNA, 
complete cds 


6e-22 


3924779 


- D, lDNA EST y k4j0a3.J CDTng 
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes fr... 

>gi|3924881|gnl|PID|e1354569
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes from... 


0.35 


1240 


U67824 


Human primary Alu 
transcript 


6e-22 


728832 


!!!! ALU SUBFAMILY SB
WARNING ENTRY 


5e-07 


1241 


AF070636 


Homo sapiens clone 
24686 mRNA 
sequence 


2e-22 


98710 


fatty-acid synthase (EC 
2.3.1.85) - Brevibacterium 
ammoniagenes


2.5 


1242 


D14034


Human gene for Zn- 
alpha2-glycoprotein, 
complete cds 


2e-22 


4185939


(Y17832) pol protein [Human 
endogenous retrovirus K] 


0.29 


1243


M61835 


Human lactase 
phlorizin hydrolase 
(LCT) gene, exon 2. 


2e-22 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


0.006 


1244


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


6e-23 


1350828 


RABPHILIN-3A 
>gi|477100|pir||A48097
rabphilin-3A - bovine
>gi|285646|gnl|PID|d1003285


0.14 


1245 


AF074985 


Homo sapiens full
length insert cDNA 
YH73H06 


8e-24 


3170548


(AF056116) unknown [Fugu
rubripes]


0.24 


1246


D14878


Human mRNA for 
protein D123,
complete cds


7e-24 


<NONE> 


<NONE> 


<NONE> 


1247


D16917


Human HepG2 3'
region cDNA, clone
imd3d07 


6e-24 


1397345


(U61955) contains multiple
region of strong similarity to
C2H2-type zinc fingers
(PS:PS00028) [Caenorhabditis
elegans]


2.4 


1248


Z69654


Human DNA
sequence from
cosmid L98A6, 
Huntington's Disease 
Region, chromosome 
4p16.3.


3e-24 


4240566 


(AF123462) neurexin III [Homo
sapiens] 


4.5 


1249 


AB007914 


Homo sapiens mRNA 
for KIAA0445 
protein, complete cds 


2e-24 


3885949 


(AF095568) amelogenin 
[Paleosuchus palpebrosus] 


3.2


1250 


AF088072 


Homo sapiens full 
length insert cDNA 
clone ZD93D10 


2e-24 


323091 


immunodominant microneme 
protein Etp100 - Eimeria tenella
>gi|2707733 (AF032905) 
microneme protein precursor 
Etmic-1 [Eimeria tenella] 


0.34 


1251 


AF069489 


Homo sapiens cAMP-
specific

phosphodiesterase 4A 
variant pde46 
(PDE4A) gene, exons 
2 through 13 and
alternative splice 
exons 3a, 6a, 6b, and 
9a 


2e-24 


728836 


!!!! ALU SUBFAMILY SP
WARNING ENTRY 


1e-05


1252 


Y12853 


Homo sapiens P2X7 
gene, exon 4-8 


9e-25 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


1e-05




M27830 


Human 28S 
ribosomal RNA gene, 
complete cds. 


8e-25 


<NONE> 


<NONE> 


<NONE> 


1254 


AB007953 


Homo sapiens 
mRNA, chromosome 
1 specific transcript 
KIAA0484


8e-25 


<NONE> 


<NONE> 


<NONE> 


1255 


Z60212 


H. sapiens CpG DNA, 
clone 195c8, forward
read cpg195c8.ft1a


8e-25 


158154 


(M81959) POU domain protein 
[Drosophila melanogaster] 


3.3 


1256 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-25 


<NONE> 


<NONE> 


<NONE> 


1257 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


7e-25 


<NONE> 


<NONE> 


<NONE> 


1258 


Y12851 


Homo sapiens P2X7 
gene, exon 1 and 
joined CDS


2e-25 


<NONE> 


<NONE> 


<NONE> 





1259


U64033


Mus musculus Tera
(Tera) mRNA,
complete cds 


9e-26 


<NONE> 


<NONE> 


<NONE> 


1260 


U19181 


Rattus norvegicus 
Rabin3 mRNA, 
complete cds. 


9e-26 


624225 


(U19181) Rabin3 [Rattus
norvegicus] 


1e-13




AF020788 


Caenorhabditis 
elegans SEL-10 (sel- 
10) mRNA, complete 
cds 


9e-26 


3915881 


1 U h'KU l hLN Candida 
CDC4 gene (TR:E234056); 
cDNA EST EMBL:D27699 
comes from this gene; cDNA 
EST EMBL:D27698 comes 
from this gene; cDNA EST 
EMBL:D32793 comes from this 
gene; cDNA EST 
EMBL:D33271 comes from this 
gen... 


7e-32 


1262 


AB016930 


Cricetulus griseus 
mRNA for 

Phosphatidylglycerop 
hosphate synthase, 
complete cds 


8e-26 


4159682 


(AB016930)

Phosphatidylglycerophosphate 
synthase [Cricetulus griseus] 


0.045 


1263 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-26 


3878629 


(Z93385) predicted using
Genefinder; Similarity to
B.subtilis GTP-binding protein


2e-10 


1264 


X91195


H.sapiens SOM172 
mRNA 


1e-26


<NONE> 


<NONE> 


<NONE> 


1265 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-26


1360637 


(X95995) ENBP1 [Vicia sativa] 


3.1 


1266 


L08237 


Human MG21
mRNA, partial cds.


1e-26


950411 


(L08237) located at OATL1
[Homo sapiens] 


9e-09 


1267 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


9e-27 


3881080


(AL032657) similar to EGF-like 
domain; cDNA EST yk299a12.3
comes from this gene; cDNA
EST EMBL:D35398 comes
from this gene; cDNA EST
yk331h6.5 comes from this
gene; cDNA EST yk299a12.5
comes from this gene; cDNA
EST yk467srS.... 


0.001


1268 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-27


1731324


HYPOTHETICAL PROTEIN
>gi|166306


4.0 


1269 


X89211 


H.sapiens DNA for 
endogenous retroviral
like element 


8e-27 


2065209 


^ i i-i / ij) wag poiyproiein [ivius 
musculus] 


0.005 


1270 


U73166 


Homo sapiens cosmid
clone LUCA15 from
3p21.3, complete
sequence [Homo
sapiens]






!!!! ALU SUBFAMILY J
WARNING ENTRY


*+e-Lp t * > 


1271 


D78255


Mouse mRNA for
PAP-1, complete cds




IOJUU70 


(D78255) PAP-1 [Mus 
musculus]


ze- iu 


1272 




Mus musculus 
Pontin52 mRNA, 


1 f=07 




spermatophorin Sp23 - yellow 
mealworm molitor] 


O 


1273 


AB015202


Homo sapiens gene
for hippocalcin, exon
2, 3 and complete cds


1e-27


3877698 


(Z83318) predicted using 
Genefinder; cDNA EST 

\/t ^^Q<»7 S i^rtmpc fmm tnic errant* 

yKjxjyc/.j cumes irum inib gene 
[Caenorhabditis elegans] 


0.37 


1274 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-27


3328188 


( AF07d.QO">"\ Inminin alnl-n fhain 

[Caenorhabditis elegans] 


0.19 


1275 


Z29336 


H.sapiens gene for 
Cu/Zn-superoxide 
dismutase 


1e-27


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


6e-05 


1276 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


9e-28


2133579


spermatophorin Sp23 - yellow
mealworm molitor]


9.2 


1277 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


9e-28


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.054 


1278 


AB001636 


Homo sapiens mRNA 
for ATP-dependent 
RNA helicase #46, 
complete cds 


4e-28 


3913425 


PUTATIVE PRE- MRNA 
SPLICING FACTOR ATP- 
DEPENDENT RNA 
HELICASE >gi|2275203 
(AC002337) RNA helicase 
isolog [Arabidopsis thaliana] 


3e-22 


1279 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-28 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.066 





1280


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


3e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


4e-05 


1281 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1282 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1283 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1284 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1285 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1286 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


<NONE> 


<NONE> 


<NONE> 


1287 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


140505 


PROBABLE INTRON 
MATURASE liverwort 
(Marchantia polymorpha) 
chloroplast >gi|11663


3.0 


1288 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


140505 


PROBABLE INTRON 
MATURASE liverwort 
(Marchantia polymorpha) 
chloroplast >gi|11663


1.8 


1289 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


2133579 


spermatophorin Sp23 - yellow
mealworm molitor] 


0.50 


1290 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.0S7 


1291 


Z63029 


H.sapiens CpG DNA, 
clone 77b3, forward
read cpg77b3.ft1a


1e-28


2493240 


HYPOTHETICAL 29.3 KD 
PROTEIN pseudotsugata 
nuclear polyhedrosis virus]


0.014 


1292 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


118588 


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3 
[Pisum sativum] 


0.010


1293 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


0.007 


1294 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.002 


1295 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


126363 


LAMININ ALPHA-1 CHAIN
PRECURSOR precursor - 
human 


3e-04 


1296 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


1e-04


1297


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


3e-05 


1298


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


3157926


(AC002131) Strong similarity to
extensin-like protein gb|Z34465
from Zea mays. [Arabidopsis
thaliana]


2e-05 


1299


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05


1300 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


1 320919 


Trypanosoma cruzi >gi|162142
(M25364) kinetoplast-associated
protein 


1e-07


1301 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-08 


1302 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-09


1303 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-10 


1304 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana]


4e-10 


1305 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-11


1306


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExtl)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-11


1307


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


4e-29 


<NONE> 


<NONE> 


<NONE> 


1308 


AF079529 


Homo sapiens cAMP- 
specific 

phosphodiesterase 8B 


4e-29 


<NONE> 


<NONE> 


<NONE> 


1309 


X93334 


H.sapiens 

mitochondrial DNA, 
complete genome


4e-29 


116977 


CYTOCHROME C OXIDASE 
POLYPEPTIDE I chain I - 
human mitochondrion (SGC1) 
>gi| 13006 (V00662) cytochrome 
oxidase I [Homo sapiens] 
>gi|506829 (J01415)
cytochrome oxidase subunit 1 
[Homo sapiens] sapiens] 


3e-09 


1310 


AF020760 


Homo sapiens serine 
protease (Omi) 
mRNA, complete cds 


4e-29 


2738915 


(AF020760) serine protease 
[Homo sapiens] 


8e-12 


1311 


U95097 


Xenopus laevis 
mitotic 

phosphoprotein 43 
mRNA, partial cds 


4e-29 


2072294 


(U95097) mitotic 
phosphoprotein 43 [Xenopus 
laevis] 


1e-25


1312 


L32162 


Homo sapiens 
transcription factor 
mRNA, 5' end. 


2e-29 


2501706 


RENAL TRANSCRIPTION 
FACTOR KID-1 finger protein 
[Mus musculus] 


8e-15 


1313 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-29


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788
come from this gene. 
[Arabidopsis thaliana] 


1e-04


1314 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-29


1169643 


FMRFAMIDE-RELATED
NEUROPEPTIDES
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis] 


1e-05


1315 


U50839 


Homo sapiens g16
protein (g16) mRNA,
complete cds 


1e-29


3212101 


(AF069517) RNA binding 
protein DEF-3 [Homo sapiens] 


6e-10 


1316


X69711


H. sapiens mRNA for
ICAM-R


5e-30


299356


intercellular adhesion molecule
3, ICAM-3=lymphocyte
function-associated antigen 1
counter-receptor homolog 
[human, tonsil, Peptide Partial, 
518 aa] 


3e-08 


1317 


AF010227


Homo sapiens
receptor-associated
coactivator 3


5e-30 


2331250 


(AF012108) Amplified in Breast
Cancer [Homo sapiens] 


8e-09


1318


AF086395


Homo sapiens full 
length insert cDNA 
clone ZD75C01 


2e-30 


3861241 


(AJ235273) CELL SURFACE 
ANTIGEN (sca5) 


4.2 


1319 


M27830 


Human 28S 
ribosomal RNA gene, 
complete cds. 


2e-30 


1730522 


PHOSPHOGLYCERATE 
KINASE 2.7.2.3) - Pyrococcus 
woesei >gi|1054832 (X73527)
phosphoglycerate kinase 
[Pyrococcus woesei] 


3.8 


1320 


M79307 


Mouse GTP-binding 
protein (Rab17)
mRNA sequence. 


2e-30 


464564 


RAS-RELATED PROTEIN
RAB-17 Rab17 - mouse
(fragment) >gi|297157
(X70804) rab17 [Mus musculus]


9e-11


1321 


AL022168 


Human DNA 
sequence from clone 
U247E12 on 
chromosome Xq22- 
23, complete 
sequence [Homo 
sapiens] 


1e-30


2072967 


(U93570) putative p150 [Homo
sapiens] 


3e-11


1322 


X85124 


M.musculus pacsin 
gene 


1e-30


2217964 


(Z50798) p52 [Gallus gallus]


1e-34


1323 


U37408 


Homo sapiens 
phosphoprotein CtBP 
mRNA, complete cds 


5e-31 


74518 


structural polyprotein -
Venezuelan equine encephalitis
virus (strain TRD)
(J04332) poly-envelope protein
[Venezuelan equine encephalitis
virus]


1.1


1324 


L04193


Human lens
membrane protein
(mp19) gene, exon
11.


2e-31


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


7e-07 


1325 


M11167


Human 28S 
ribosomal RNA gene. 


6e-32 


<NONE> 


<NONE> 


<NONE>


1326 


M33336 


Human cAMP- 
dependent protein 
kinase type I-alpha 
subunit (PRKAR1A) 
mRNA, complete cds 


2e-32 


<NONE> 


<NONE> 


<NONE> 


1327 


J03060 


Human 

glucocerebrosidase 
pseudogene, complete 
cds 


2e-32 


2144479 


glucosylceramidase (EC 
3.2.1.45) precursor - human 


1e-05


1328 


U33053 


Human lipid- 
activated protein 
kinase PRK1 mRNA, 
complete cds 


7e-33 


2137689 


protein kinase (EC 2.7.1.37) - 
mouse 


1e-14


1329 


J04617 


Human elongation 
factor EF-1-alpha
gene, complete cds. > ::
dbj|E02629|E02629
DNA of human
polypeptide chain
elongation factor-1
alpha


6e-33 








1330 


L40396 


Homo sapiens (clone
s22i71) mRNA
fragment


6e-33 


124235 


INTERMEDIATE FILAMENT 
PROTEIN B protein B - 
common roundworm 


1.00 


1331 


Z72813 


S.cerevisiae 
chromosome VII 
reading frame ORF 
YGR028w 


6e-33 




MSP1 PROTEIN HOMOLOG
Yeast MSP1 protein (TAT-
binding homolog 4)


oeou 


1332 


AB007941 


Homo sapiens mRNA 
for KIAA0472 
protein, partial cds 


2e-33 


1150834 


(U42471) Wiscott-Aldrich
Syndrome protein homolog
[Mus musculus]


2.0 


1333 


AF044574


Rattus norvegicus
putative peroxisomal
2,4-dienoyl-CoA
reductase (DCR-
AKL) mRNA,
complete cds


2e-34 


4105269 


(AF044574) putative
peroxisomal 2,4-dienoyl-CoA
reductase [Rattus norvegicus]


6e-15 


1334 


D14657


Human mRNA for
KIAA0101 gene,
complete cds


7e-35 


<NONE> 


<NONE> 


<NONE> 


1335 


X69910


H.sapiens p63 mRNA
for transmembrane
protein


7e-35 


2136323


trithorax homolog HTX - human
(fragment) homolog=MLL
{alternative splicing, clone 14p-
18B}


0.94


1336


AF053455


Homo sapiens
tetraspan TM4SF
(TSPAN-5) gene,
complete cds


7e-35 


3152703 


(AF065389) tetraspan NET-4 
[Homo sapiens] 


1e-25


1337 


X58374 


D.melanogaster crn 
mRNA 


3e-35 


117478


CROOKED NECK PROTEIN 


6e-41 


1338 


AF086492 


Homo sapiens full 
length insert cDNA 
clone ZD95D11 


9e-36 


2909809 


(AF031328) aminoglycoside 6'- 
N-acetyltransferase It 


1.9 


1339 


79622T 


H.sapiens telomeric 
DNA sequence, clone 
12PTEL120, read 
12PTELOO120.seq


3e-36 


2408068 


(Z99165) hypothetical protein


0.61 


1340 


Z37986 


H.sapiens mRNA for 
phenylalkylamine 
binding protein. 


1e-36


1362793 


emopamil-binding protein - 
human >gi|780263 


5e-11


1341


U57847


Human ribosomal
protein S27 mRNA,
complete cds. > ::
gb|AA316327|AA316
327 EST188061 HCC
cell line (metastasis to
liver in mouse) II
Homo sapiens cDNA
5' end similar to
similar to
metallopanstimulin 1


3e-37 


1171014 


40S RIBOSOMAL PROTEIN 
S27 growth factor-inducible zinc 
finger protein MPS-1 - human 
>gi|431319 (L19739) 
metallopanstimulin [Homo 
sapiens] >gi|1373421 (U57847)
ribosomal protein S27 


1.4


1342


Y15054


Rattus norvegicus
mRNA for 70 kDa
tumor specific
antigen, partial


3e-37 


3123027 


70 KD WD-REPEAT TUMOR- 
SPECIFIC ANTIGEN 
>gi|2505957|gnl|PID|e353992 
(Y15054) 70 kD tumor-specific
antigen [Rattus norvegicus] 


2e-15 


1343 


AF084205 


Rattus norvegicus 
serine/threonine 
protein kinase TAO1
mRNA, complete cds


3e-37


3452473 


(AF084205) serine/threonine 
protein kinase TAO1 [Rattus
norvegicus]


5e-4~ 


1344 


X78604 


R. norvegicus 
(Sprague Dawley)
ARL5 mRNA for 
ARF-like protein 5 


1e-37


<NONE> 


<NONE> 


<NONE> 



1345 


AJ236644


Homo sapiens
chromosome 22 CpG
island DNA, genomic
MseI fragment, clone
22CGIB49A3,
complete read


1e-37


2239219 


(Z97210) hypothetical protein 


6e-05 


1346 


U09367 


Human zinc finger
protein ZNF136


4e-39 


2137269 


DNA-binding protein - mouse 
>gi|437444 


7e-23 


1347 


Z69649 


Human DNA 
sequence from 
cosmid L69F7B, 
Huntington's Disease 
Region chromosome
4p16.3 contains
Huntington Disease 
(HD) gene. 


3e-39 


3096918 


(AL023094) putative cyclase 
associated protein CAP 
[Arabidopsis thaliana] 


5.6 


1348 


AF065389 


Homo sapiens 
tetraspan NET-4 
mRNA, complete cds 


1e-39


3152703 


(AF065389) tetraspan NET-4 
[Homo sapiens] 


6e-29 


1349 


AF038172 


Homo sapiens clone
23923 mRNA
sequence


1e-40


1813464 


('1160883) CaoC [Bacillus
firmus] 


2.8 


1350 


Z83095 


H.sapiens Fanconi 
anaemia group A 

41, 42 and 43 


1e-40


2137870 


zinc finger protein - mouse 
(fragment) 


3e-23 


1351 




Homo sapiens 17- 
beta-hydroxysteroid 
dehydrogenase IV 
(HSD17B4) gene, 
exon 16 


1e-40


2842416


(AL008730) dJ487J7.1.1
(putative protein dJ487J7.1
isoform 1) [Homo sapiens]


6e-61 


1352 


AF070567 


Homo sapiens clone 
24544 beta- 
dystrobrevin mRNA, 
partial cds 


4e-41 


3133087 


(Y15718) dystrobrevin B DTN-
B2 [Homo sapiens] 


7e-13 


1353 


AF006088 


Homo sapiens Arp2/3 
protein complex 
subunit p16-Arc
(ARC16) mRNA,
complete cds


2e-41


3121767 


ARP2/3 COMPLEX 16 KD 
SUBUNIT 


3e-36 


1354 


X69942 


M.musculus mRNA 
of enhancer-trap- 
locus 1 


6e-42 


2291152 


(AF01641S) No definition line 
found [Caenorhabditis elegans] 


6.4


1355 


X87838 


H.sapiens mRNA for 
beta-catenin 


5e-42 


1373019 


(U28811) cysteine-rich 
fibroblast growth factor receptor 


8e-05 


1356 




Homo sapiens mRNA 
for KIAA0725 
protein, partial cds 




JOOZ 1/1 


(AB018268) KIAA0725 protein
[Homo sapiens] 




1357 


M34424


Human cathepsin E 
(CTSE) gene, exon 9 
and complete cds. 


2e-42 


<NONE> 


<NONE> 


<NONE> 


1358


U80776


Human EST clone
NIB1543 mariner
transposon Hsmar1
orf gene, complete 
cds 


2e-42 


2231380 


(U80776) orf; encodes putative
chimeric protein with SET 
domain in N-terminus with 
similarity to several other 
human, Drosophila, nematode 
and yeast proteins [Homo 
sapiens] 


3e-11


1359 


U55184 


Human G protein 
Golf alpha gene, exon 
12 and complete cds 


2e-42 


3165531 


\t\Jr\Jo / ouo ) iNo ueriruuon line 
found [Caenorhabditis elegans] 


1e-16


1360 




Homo sapiens PAC 
clone DJU52D16 
irom ^vLjz j , cumpicic 
sequence [Homo 
sapiens] 


oe-4j 




(AB007407) myeloid zinc finger 
protein-2 [Mus musculus]




1361 


AB018284


Homo sapiens mRNA 
for KIAA0741 
protein, complete cds 


5e-43 


<NONE> 


<NONE> 


<NONE> 


1362 


AB011137 


Homo sapiens mRNA
for KIAA0565 
protein, complete cds 


5e-43 


3043654 


(AB011137) KIAA0565 protein 
[Homo sapiens] 


1e-07


1363 




Human set gene,
complete cds.


2e-43 




<NONE> 


<NONE> 


1364 


Z47087 


H.sapiens mRNA for 
RNA polymerase II 
elongation factor-like 
protein. 


2e-43 


1872514 


(U84404) E6-associated protein 
E6-AP/ubiquitin-protein ligase 
[Homo sapiens] >gi|2361031 
(AF01670S) E6-AP ubiquitin- 
protein ligase [Homo sapiens] 


7.2 


1365 


U27197 


Drosophila 
melanogaster pelota 
(pelo) mRNA,
complete cds 


2e-43 


1352736 


PELOTA PROTEIN >gi|973224 
(U27197) pelota [Drosophila 
melanogaster] 


1e-46


1366


D80007


Human mRNA for
KIAA0185 gene,
partial cds


6e-44


2498864


RRP5 PROTEIN HOMOLOG
(KIAA0185) hypothetical
protein YM9959.11C of
S.cerevisiae. [Homo sapiens]


6e-09 


1367 


AF005039 


Homo sapiens
secretory carrier 
membrane protein 
(SCAMP3) mRNA, 
complete cds 


6e-44 


2232243 


(AF005039) secretory carrier 
membrane protein [Homo 
sapiens] 


2e-09 


1368 


X68101 


R.norvegicus trg
mRNA


2e-44 


550420 


(X68101) trg gene product
[Rattus norvegicus]


1e-37


1369 


AF044206 


Homo sapiens 
cyclooxygenase 
(COX-2) gene, 
promoter and exon 1 


2e-45 


2072953 


(U93565) putative p150 [Homo
sapiens] 


5e-06 


1370 


L48708 


Homo sapiens 
faciogenital dysplasia 

of intron 17 


8e-46 


<NONE> 


<NONE> 


<NONE> 


1371 


X15822 


Human COX VIIa-L 
mRNA for liver- 
specific cytochrome c 
oxidase (EC 1.9.3.1.) 


3e-46 


117121 


CYTOCHROME C OXIDASE
POLYPEPTIDE VIIA-LIVER 
PRECURSOR 
>gi|2144370|pir||OSHU7L 
cytochrome-c oxidase (EC 
1.9.3.1) chain VIIa precursor,
hepatic - human >gi|30147 
(X15822) precursor (AA -23 to 
60) [Homo sapiens] 


5e-13 


1372 


U47323 


Mus musculus
stromal cell protein
mRNA, complete cds


3e-46 


1493833 


(U47323) stromal cell protein
[Mus musculus]


1e-48


1373 


AF059524 


Homo sapiens 
reticulon gene family 
protein 


7e-47 


1731169 


HYPOTHETICAL 113.1 KD
PROTEIN T28D9.7 IN 
CHROMOSOME II >gi|S61264 
(U28738) coded for by C. 
elegans cDNA yk8h5.3; coded 
for by C. elegans cDNA 
yk8h5.5; similar to C. elegans 
deg-1 and mec-4 in exon 2 
[Caenorhabditis elegans]


7.S 


1374 


AJ132583 


Homo sapiens mRNA 
for puromycin 
sensitive 
aminopeptidase, 
partial 


3e-47 


1777519 


(U39123) T cell receptor beta 
chain [Homo sapiens]


9.7 






WO 01/02568 



PCT/US00/18374 





1375


M97856


Homo sapiens histone 
binding protein 
mRNA, complete cds. 


3e-47 


2645327 


(U83821) NADH 
dehydrogenase subunit 3 
[Oryzomys palustris] 


5.7 


1376 


U53220 


Human 

retinoblastoma- 
related Rb2/p130
gene, 5' flanking 
region and partial cds 


3e-47 


2499225 


CMP-SIALIC ACID 
TRANSPORTER CMP-sialic 
acid transporter [Cricetulus 
griseus]


5.3 


1377 


X87870 


H.sapiens mRNA for 
hepatocyte nuclear 
factor 4a 


1e-47


728832 


!!!! ALU SUBFAMILY SB
WARNING ENTRY 


7.3 


1378 


AF060195 


Mus musculus 
proteasome regulator 
PA28 beta subunit 
gene, complete cds 


3e-48 


478681 


limb deformity protein - chicken 


0.25 


1379 


AB018285


Homo sapiens mRNA 
for KIAA0742 
protein, partial cds 


1e-48


3122969 


TESTIS SPECIFIC PROTEIN 
A (ZINC FINGER PROTEIN 
TSGA) >gi|281040|pir||S28499 
probable zinc finger protein - rat 
>gi|57504 (X59993) zinc finger 
protein 


1e-30


1380 


U35032 


Human endogenous 
retrovirus clone 
c5.ll, HERV-H 
multiply spliced 
subgenomic leader, 
protease and integrase 
region mRNA, partial 
cds 


4e-49 


88558 


retroviral proteinase-like protein 
- human 


6e-05 


1381 


AB007956 


Homo sapiens 
mRNA, chromosome 
1 specific transcript 
KIAA0487 


1e-49


<NONE> 


<NONE> 


<NONE> 


1382 


D86987 


Homo sapiens mRNA 
for KIAA0214
protein, complete cds 


1e-49


2497944 


ALPHA SCRUIN >gi|633238 
(Z38132) scruin [Limulus 
polyphemus] 

>gi|1093326|prf||2103269A
scruin [Limulus sp.]


9.7 


1383 


U25826 


Human transcription 
factor (SCI) gene, 
complete cds. 


4e-50 


<NONE> 


<NONE> 


<NONE>


1384


U46690


Mus musculus ATP-
dependent RNA
helicase mRNA,
partial cds.


4e-50 


1335873 


(U46690) ATP-dependent RNA 
helicase [Mus musculus] 


3e-24 


1385 


AF072128 


Mus musculus 
claudin-2 mRNA,
complete cds 


2e-50 


3335184 


(AF072128) claudin-2 [Mus 
musculus] 


4e-24 


1386 


AF093593 


Homo sapiens 
snRNA activating 
protein complex 
19kDa subunit 
(SNAP19) mRNA, 
complete cds 


1e-50


3668416 


(AF093593) snRNA activating 
protein complex 19kDa subunit
[Homo sapiens] 


0.003 


1387 


U79745 


Homo sapiens 
monocarboxylate 
transporter 
homologue MCT6 
mRNA, complete cds 


1e-50


1177607 


(X92485) pval [Plasmodium 
vivax] 


2e-07 


1388 


L09647 


Rattus norvegicus 
hepatocyte nuclear 
factor 3a 


1e-50


404764 


(L10409) fork head related 
protein [Mus musculus] 


2e-21 


1389 


X61506 


Mouse E46 mRNA 
for E46 protein 


4e-51 


114909


BRAIN PROTEIN E46 


1e-20


1390 


M33387 


Human debrisoquine 
4-hydroxylase 
(CYP2D8P) and 


1e-51


126296 


LINE-1 REVERSE 
TRANSCRIPTASE 
HOMOLOG protein 
[Nycticebus coucang]


5e-15 


1391 


AF019767 


Homo sapiens zinc 
finger protein (ZPR1) 
mRNA, complete cds 


4e-52 


961507 


(D63788) anchor protein, LCM 


5.9 


1392 


Z37986 


H. sapiens mRNA for 
phenylalkylamine 
binding protein. 


2e-52 


<NONE> 


<NONE> 


<NONE> 


1393 


U65416 


Human MHC class I 
molecule (MICB) 
gene, complete cds 


2e-52 


3878637 


(Z49128) weak similarity with
SINR protein (Swiss Prot
accession number P06533);
cDNA EST EMBL:T00631
comes from this gene; cDNA
EST yk293d10.5 comes from
this gene [Caenorhabditis
elegans]


8.7


1394


Z57647


H.sapiens CpG DNA,
clone 189a6, forward
read cpg189a6.ft1a.


2e-52


111187


beta-globin DNA-binding
protein B1, transcription factor
PU.1 - mouse >gi|200586
(M32370) PU.1 protein [Mus
musculus] >gi|200972
(M38252) transcription factor
Pu.1 [Mus musculus]


5.8 


1395 


L13738 


Human activated 
p21cdc42Hs kinase 
(ack) mRNA,
complete cds. 


2e-52 


2921447 


(AF037260) non-receptor 
protein tyrosine kinase Ack 
[Mus musculus] 


7e-23 


1396 


AF042379 


Homo sapiens spindle 
pole body protein 
spc97 homolog GCP2 
mRNA, complete cds 


7e-53 


2801701 


(AF042379) spindle pole body 
protein spc97 homolog GCP2 


1e-16


1397 


AF047441 


Homo sapiens RNA 
polymerase I 40kD 
subunit mRNA, 
complete cds 


6e-53 


3914807 


DNA-DIRECTED RNA
POLYMERASE I 40 KD 
POLYPEPTIDE (RPA40) 
(RPA39) >gi|2266929 
(AF008442) RNA polymerase I 
subunit hRPA39 [Homo 
sapiens] 


4e-19 


1398 


AF104670 


Homo sapiens cell 
cycle protein 
(PA2G4) gene, exons
6 through 13, and 
complete cds 


2e-53 


<NONE> 


<NONE> 


<NONE> 


1399 


S60754


{VNTR locus DX24,
hypervariable tandem
repeat cluster}
[human, Genomic,
2991 nt] > ::
gb|L07935|HUMVNT
RA Homo sapiens
microsatellite VNTR
DNA sequence.


2e-53 


1209669


(U38810) CAGR1 [Homo
sapiens] >gi|3098420
(AF040945) homeotic regulator
homolog MAB21 [Mus
musculus]


4.6


1400 


D86972


Human mRNA for
KIAA0218 gene,
complete cds


1e-53


3426041


(AC005168) unknown protein
[Arabidopsis thaliana]


9.1


1401 


AJ236682


Homo sapiens
chromosome 22 CpG
island DNA, genomic
MseI fragment, clone
22CGIB49E6,
complete read


7e-54


3928721


(AL034355) putative
cytochrome oxidase subunit I
[Streptomyces coelicolor] 


0.30 


1402 


AJ236682


Homo sapiens
chromosome 22 CpG
island DNA, genomic
MseI fragment, clone
22CGIB49E6,
complete read


0eO4 


3928721


(AL034355) putative
cytochrome oxidase subunit I
[Streptomyces coelicolor] 


0.28 


1403 




Human histone 
(H2A.Z) mRNA, 
complete cds. 


oe-54 


1 70711 


histone H2A.F, embryonic - 
chicken 


2e-16


1404 


AJ009947


Homo sapiens mRNA 
for putative ATPase, 
partial 


6e-54 


3550295


(AJ009947) putative ATPase
[Homo sapiens]


3e-18 


1405 


Y08459 


B.taurus mRNA for
novel cytoplasmic
protein


2e-54


<NONE>


<NONE>


<NONE> 


1406 


AF042384 


Homo sapiens BC-2 
protein mRNA,
complete cds 


2e-54 


2828147 


(AF042384) BC-2 protein 
[Homo sapiens] 


2e-14


1407


AF042379 


Homo sapiens spindle 
pole body protein 
spc97 homolog GCP2 
mRNA, complete cds 


8e-55


2801701 


(AF042379) spindle pole body 
protein spc97 homolog GCP2


2e-17 


1408


AF005355


Oryctolagus
cuniculus translation
initiation factor
eIF2C mRNA,
complete cds


7e-55


3253159


(AF005355) translation
initiation factor eIF2C




1409 


AF008442


Homo sapiens RNA
polymerase I subunit
hRPA39 mRNA,
complete cds


3e-55


3335138


(AF047441) RNA polymerase I
40kD subunit [Homo sapiens]


3e-20 


1410


AF047441


Homo sapiens RNA
polymerase I 40kD
subunit mRNA,
complete cds


3e-55


3335138


(AF047441) RNA polymerase I
40kD subunit [Homo sapiens]


3e-20


1411


X08004


Human mRNA for
Rap1B protein > ::
emb|A08693|A08693
H.sapiens rap1b
cDNA


2e-55 


539995 


transforming protein rap1b - rat
(strain Copenhagen) 


2e-18 


1412 


AF010403 


Homo sapiens ALR 
mRNA, complete cds 


2e-55 


2358285 


(AF010403) ALR [Homo
sapiens] 


1e-49


1413 


M77016 


Human tropomodulin 
mRNA, complete cds. 


8e-56 


262249 


(S52010) orfl 5' of EpoR [mice, 
Peptide, 85 aa] [Mus sp.] 


0.027 


1414 


AB020633 


Homo sapiens mRNA 
for KIAA0826 
protein, partial cds 


2e-56 


<NONE> 


<NONE> 


<NONE> 


1415 


X87489 


H.sapiens genomic 
DNA (chromosome 
3; clone NL1243D)


2e-56 


1814029 


(U84501) cuticle collagen 
[Caenorhabditis briggsae]


0.038 


1416 


AB007893 


Homo sapiens 
KIAA0433 mRNA, 
partial cds 


2e-56 


2887437 


(AB007893) KIAA0433 [Homo 
sapiens] 


9e-21 


1417 


X78925


H.sapiens HZF2 
mRNA for zinc finger 
protein 


1e-56


3342002 


(AF0541S0) hematopoietic cell 
derived zinc finger protein 
[Homo sapiens]


2e-21


1418 


Z56281


H.sapiens mRNA for 
interferon regulatory 
factor 3 


9e-57 


2497442 


INTERFERON 
REGULATORY FACTOR 3 
factor 3 [Homo sapiens] 


2e-21 


1419 


U78772 


Homo sapiens nuclear 
VCP-like protein
NVLp.1


8e-57 


2406565 


(U68140) nuclear VCP-like
protein NVLp.2 [Homo sapiens] 


5e-20


1420 


D79994 


Human mRNA for 
KIAA0172 gene, 
partial cds 


3e-57 


1136404 


(D79994) similar to ankyrin of 
Chromatium vinosum. [Homo 
sapiens] 


9e-38 


1421 


AB002342 


Human mRNA for 
KIAA0344 gene, 
complete cds 


1e-57


2224629 


(AB002342) KIAA0344 [Homo 
sapiens] 


4e-20


1422 


L19437


Human transaldolase 
mRNA containing 
transposable element, 
complete cds 


1e-57


1553119 


(U63159) transaldolase [Mus 
musculus] 


2e-20 


1423 


D17532 


Human mRNA for 
RCK, complete cds 


9e-58 


129376 


PROBABLE ATP- 
DEPENDENT RNA 
HELICASE P54 (ONCOGENE 
RCK) (DEAD BOX PROTEIN 
6) 


1e-10


1424 


X79568 


H.sapiens BDP1
mRNA for protein- 
tyrosine-phosphatase 


9e-58 


1871531 


(X79568) protein-tyrosine-
phosphatase 


1e-22


1425 


X79568 


H.sapiens BDP1
mRNA for protein- 
tyrosine-phosphatase 


9e-58 


1871531 


(X79568) protein-tyrosine- 
phosphatase 


9e-23 


1426 


AB012295


Homo sapiens 
HKE1.5 mRNA for 
GDS-related protein, 
complete cds 


7e-58 


2648021 


(Z97184) RGL2 [Homo sapiens]


9e-19 


1427 


AF086040 


Homo sapiens full 
length insert cDNA 
clone YX52E07 


1e-58


543222 


glutamine (Q)-rich factor 1,
QRF-1 - mouse factor 1, QRF-1 
[mice, B-cell leukemia, BCL1, 
Peptide Partial, 84 aa] 


3e-36 


1428 


AB018195 


Homo sapiens ca xi 
mRNA for carbonic 
anhydrase-related
protein XI, complete
cds 


4e-59 


<NONE> 


<NONE> 


<NONE> 


1429 


AF071777 


Mus musculus IRE1
(Ire1) mRNA,
complete cds 


4e-59 


3766209 


(AF071777) IRE1 [Mus 
musculus] 


7e-2S 


1430 


AB000462 


Homo sapiens mRNA 
for SH3 binding 
protein, complete cds, 
clone:RES4-23A


3e-59 


<NONE> 


<NONE> 


<NONE> 


1431 


AF038172 


Homo sapiens clone 
23923 mRNA 
sequence 


3e-59 


3758855 


(Z98551) MAL3P6.11 
[Plasmodium falciparum] 


1.3 


1432 


Z84812 


Human DNA 

sequence from phage 
pTEL from a contig 
from the tip of the 
short arm of 
chromosome 16, 
spanning 2Mb of 
16p13.3 Contains
ESTs


1e-59


400927


RIBONUCLEOPROTEIN
RB97D ribonucleoprotein
[Drosophila melanogaster]


2.5 


1433 


U36484


Human laminin-
binding protein gene,
partial cds, and E2
small nucleolar RNA
gene, complete
sequence


1e-59


226005


protein 40kD [Mus musculus]


7e-05


1434


L11285


Homosapiens ERK
activator kinase
(MEK2) mRNA.


1e-59


i.T77UJU


DUAL SPECIFICITY
MITOGEN-ACTIVATED
PROTEIN KINASE KINASE 2
(MAP KINASE KINASE 2)
(MAPKK 2) kinase type 2


Je-2 1 


1435 


AF086555 


Homo sapiens full 
length insert cDNA 
clone ZE14E04 


4e-60 


3287674 


(AC005239) F23149_l [Homo 


^e-U4 


1436 


M24766 


Human (clone 
pHAIV2-12) alpha-2 
collagen type IV 


4e-60 


29551 


(X05610) alpha (2) chain
[Homo sapiens] 


6e-15 


1437 


X65550


H.sapiens mki67a 
mRNA (long type) 
for antigen of 
monoclonal antibody 
Ki-67 


4e-60 


1170654 


ANTIGEN KI-67 
>gi|539555|pir||A48666 cell 
proliferation antigen Ki-67, long 
form - human Ki-67 [Homo
sapiens] 


3e-15 


1438 


M27319 


Human calmodulin 
mRNA, complete cds. 


4e-60 


1345451 


(X05949) Calmodulin (AA 2 - 
y+~*y ii» isi Ddse in coQonj 
[Drosophila melanogaster]


7e-20 


1439 


Y12781 


Homo sapiens mRNA 
for transducin (beta) 
like 1 protein 


3e-60 


62133 


(XOfi 1 1">\ nut l>4 Iff) nrnf(»in 

/ —) pui. id'-* ku protein 
(AA 1 - 1187); put replicase 


7.4 


1440 


AB002383 


Human mRNA for 
KIAA0385 gene, 
complete cds 


1e-60


1001548 


(D64000) hypothetical protein 


4.4 


1441 


AF070614 


Homo sapiens clone 
24732 unknown 
mRNA, partial cds 


2e-61 


3283879 


(AF070614) unknown [Homo 
sapiens] 


3e-17 


1442 


AB002326 


Human mRNA for 
KIAA0328 gene, 
partial cds 


6e-62 


547891 


MICROTUBULE- 
ASSOCIATED PROTEIN 4 
microtubule-associated protein- 
U [Bos taurus] 


5.6 


1443 


AF086471 


Homo sapiens full 
length insert cDNA 
clone ZD88A01 


5e-62 


<NONE> 


<NONE> 


<NONE> 


1444 


AB002311


Human mRNA for
KIAA0313 gene,
complete cds


2e-62 


2506357


DIHYDROXYPHENYLPROPI
ONATE 1,2-DIOXYGENASE
>gi|1657544 (U73857) similar
to mcpI gene (catechol 2,3-
dioxygenase) of A. eutrophus 3-
(2,3-
dihydroxyphenylpropionate) 1,2-
dioxygenase 2,3-
dihydroxyphenylpropionate 1,2-
dioxygenase


3.4


1445 


AF069737


Xenopus laevis
notchless (nle)
mRNA, complete cds


2e-62


3687833


(AF069737) notchless [Xenopus
laevis]


1e-55


1446 


AF044209


Homo sapiens nuclear
receptor co-repressor
N-CoR mRNA,
complete cds


5e-63


2137603


nuclear receptor co-repressor N-
CoR - mouse musculus]
>gi|1583865|prf||2121436A
thyroid hormone receptor co- 
repressor [Mus musculus] 


2e-47 


1447 


M69238 


Human aryl 
hydrocarbon receptor 
nuclear translocator 
(ARNT) mRNA, 
complete cds. 


2e-63 


2702319


(AF001307) aryl hydrocarbon 
receptor nuclear translocator; 
Arnt [Homo sapiens] 


5e-19 


1448 


X80497 


H.sapiens PHKLA 
mRNA 


2e-63 


1170685 


PHOSPHORYLASE B
KINASE ALPHA
REGULATORY CHAIN,
LIVER ISOFORM
(PHOSPHORYLASE KINASE
ALPHA L SUBUNIT)
>gi|663010 (X80497)
phosphorylase kinase
phosphorylase kinase alpha
subunit [Homo sapiens]


5e-22 


1449


AF031141


Homo sapiens 
ubiquitin conjugating 
enzyme 


2e-63


2623260 


(AF031141) ubiquitin 
conjugating enzyme [Homo 
sapiens] 


1e-23


1450


Z37166


H.sapiens BAT1
mRNA for nuclear
RNA helicase


6e-64 


2500529 


PROBABLE ATP- 
DEPENDENT RNA 
HELICASE P47 
>gi|2135840|pir||I37201 nuclear
RNA helicase (DEAD family)
BAT1 - human >gi|587146
(Z37166) nuclear RNA helicase
(DEAD family) [Homo sapiens]


9e-24 


1451


M64240


Human helix-loop-
helix zipper protein
(max) mRNA,
complete cds. > ::
gb|I41138|I41138
Sequence 1 from
patent US 5624818 >
:: gb|I77062|I77062
Sequence 1 from
patent US 5693487


5e-64


88175


Myc-binding factor Max, short
form - human


8e-22


1452 


M98252


Homo sapiens lysyl
hydroxylase (partial
clone 2.2 Kb LH)
mRNA, complete
mature peptide.


2e-64


400205


PROCOLLAGEN-LYSINE,2-
OXOGLUTARATE 5-
DIOXYGENASE 
PRECURSOR (LYSYL 
HYDROXYLASE) lysyl 
hydroxylase [Homo sapiens] 


7e-22 


1453 


U09550


Human oviductal 
glycoprotein mRNA, 
complete cds. 


8e-65 


2493676


OVIDUCT-SPECIFIC 
GLYCOPROTEIN 
PRECURSOR (OVIDUCTAL 
GLYCOPROTEIN) 
(OVIDUCTIN) 


2e-11


1454 


X67877 


R.norvegicus mRNA 
for cytosolic 
resiniferatoxin- 
binding protein 


7e-65 


423664


resiniferatoxin-binding protein
RBP-26, cytosolic - rat
>gi|311660 (X67877) cytosolic
resiniferatoxin binding protein
RBP-26 [Rattus norvegicus]
>gi|1093373|prf||2103310A
resiniferatoxin-binding protein
[Rattus norvegicus]


2e-40


1455 


AB018254 


Homo sapiens mRNA 
for KIAA0711
protein, complete cds 


6e-65


92298 


glutamine/glutamic acid-rich 
protein 


0.9S 


1456


J03607 


Human 40-kDa 
keratin intermediate 
filament precursor 
gene. 


3e-65 


1070608 


keratin 19, type I, cytoskeletal - 
human sapiens] 


4e-07


1457


U65896 


Human gamma- 
glutamyl carboxylase 
gene, complete cds 


2e-65


<NONE> 


<NONE> 


<NONE>


1458


U07681


Human NAD(H)-
specific isocitrate
dehydrogenase alpha
subunit precursor
mRNA, complete cds.


2e-65


1708399


ISOCITRATE
DEHYDROGENASE (NAD),
MITOCHONDRIAL SUBUNIT
ALPHA PRECURSOR
(ISOCITRIC
DEHYDROGENASE) (NAD+
SPECIFIC ICDH)
dehydrogenase alpha chain
precursor - human >gi|706839
subunit precursor [Homo
sapiens]


4e-26 


1459


U8S080


Human zinc finger
protein (LD5-1) gene,
exons 4, 5 and 6, and
complete cds


2e-65


1373394


(U57796) zinc finger protein
[Homo sapiens] >gi|2306773


2e-39


1460 


M96625 


Gallus domesticus 
tensin mRNA 
sequence. 


3e-66 


2134419 


tensin - chicken (fragment) 
>gi|63805 (Z18529) tensin
[Gallus gallus] >gi|212755
(L06662) tensin [Gallus gallus]


1e-51


1461 


U13262


Mus musculus myelin
gene expression
factor (Ivthr-2)
mRNA, partial cds.


1e-70


536926


(U13262) myelin gene 
expression factor [Mus 
musculus] 


9e-42 


1462 


U64033 


Mus musculus Tera 
(Tera) mRNA, 
complete cds 


5e-72 


1575505


(U64033) Tera [Mus musculus]


9e-34 


1463 


X78989 


M.musculus mRNA 
for testin 


6e-74 


1351218


TESTIN 2 (TES2) 
[CONTAINS: TESTIN 1 


8e-31 


1464 


U64033 


Mus musculus Tera 
(Tera) mRNA,
complete cds 


2e-74 


1575505 


(U64033) Tera [Mus musculus]


5e-37 


1465 


AF057365 


Canis familiaris UDP 
N-acetylglucosamine 
transporter mRNA, 
complete cds 


9e-79


3298605 


(AF057365) UDP N- 
acetylglucosamine transporter 
[Canis familiaris] 


9e-10


1466 


AJ006064 


Rattus norvegicus 
mRNA for coronin- 
like protein 


1e-82


3757680 


(AJ006064) coronin-like protein 
[Rattus norvegicus]


3e-62 


1467 


U91582 


Macaca fascicularis 
UDP-
glucuronosyltransfera
se mRNA, complete
cds


4e-89


140396 


KARYOGAMY PROTEIN 
KAR4 yeast (Saccharomyces 
cerevisiae) 


le-OS 


1468 


X06762 


Mouse Hox2.3 
mRNA 


3e-92


123255


HOMEOBOX PROTEIN HOX- 
B7 (HOX-2C) 


9e-23 


1469 


( 
i 
I 
1 

AB016930 c 


Iricetulus griseus 
ttRNA for 

^hosphatidy Iglycerop 
losphate synthase, 
•omplete cds 


5e-94 J 


( 
I 

4159682 s 


ABO 16930) 

Phosphatidyl glycerophosphate 
ynthase [Cricetulus griseus] 


7e-34 


1470 


X74504 n 


A. musculus T10 
aRNA 


7e-97 | 


1 

1711658 1 


JER/THR-RICH PROTEIN 
"TO IN DGCR REGION 
'gi|480900|pir||S3748S gene 
TO protein - mouse 


3e-59 
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SEC 
ID 


5 Nearest 
) 1 

1 ACCESSIO* 


Neighbor (BlastN vs. ( 
1 DESCRIPTION 


jenbank) 
P VALUE 


1 Nearest Neieh 
J ACCESSION 


bor(BlastX vs. Non-Redundant P 
DESCRIPTION 


roteins) 
P VALUE 


1471 


U13175 


Rattus norvegicus 
clone ubclOa 

1 lOin 1 1 1 tl n /"•r~\nilifT>t'inn 

uL/ivjmuii cuiij ugaling 

enzyme (E217kB) 
mRNA, complete cds. 


3e-98 


1351345 


UUll^Ul 1 UUATTNO" 

"ENZUMb E2=T7 HD 3 

(UBIQUITIN-PROTEIN 
LIGASE) (UBIQUITIN 
UAKKltK r*K(J 1 xilN) 
(E2(17)KB 3) 
>gi|1085588|pir||S53358 
ubiquitin conjugating enzyme 
(E217kB)-rat >gi|595666 
(U 13 175) ubiquitin conjugating 
enzyme [Rattus norvegicus] 
norvegicus] >gi| 1145691 
(U39318) UbcH5C [Homo 
sapiens] 


5e-05 


1472 


S79873 


h-Iamp-2=Iysosome- 
associated membrane 
protein-2 protein-2b 
(LAMP2) mRNA, 

aiici flan vciy ipJlLCQ 

form h-lamp-2b, 
complete cds. 


e-119 


<NONE> 


<NONE> 


<NONE> 


1473 


D 13623 


Rat mRNA for p34 
protein, complete cds 


e-112 


480379 


ribosome-binding protein p34 - 
rat sp.] 


2e-05 


14741 


ABO 13357 


Mus musculus mRNA 
for 49 kDa zinc finger 
protein, complete cds 


e-136 \ 


4153886 


(AB013357) 49 kDa zinc finger 
protein 


5e-08 


14751 


] 
1 

ABO 16930 c 


Cricetulus griseus 
mRNA for 

Phosphatidylglycerop 
losphate synthase, 
:omplete cds 


e-117 


< 
] 

4159682 


ABO 16930) 

3 hosphatidy Iglycerophosphate 
>ynthase [Cricetulus griseus] 


4e-32 


1476| 


I 
i 

( 

U38253 n 


Rattus norvegicus 
nitiation factor elF- 
.B gamma subunit 
eEF-2B gamma) I 
nRNA, complete cds | 


e-103 


I 
I 

2494312 s 


rRANSLATION INITIATION 
r ACTOR EIF-2B GAMMA 
JUBUNIT (EEF-2B GDP-GTP 
EXCHANGE FACTOR) 
ubunit [Rattus norvegicus] 


3e-42 
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SEC 
ID 


Nearest 

\ 

ACCESSIOr 


Neighbor (BlastN vs. 
vj DESCRIPTION 


Genbank) 
P VALUE 


Nearest Neigh 

ACCESSION 


bor (BlastX vs. Non-Redundant Proteins) 

DESCRIPTION j p VALUE 


1477 


! A/jOoJ 


R.norvegicus mRNA 
for histone H3.3 


e-117 


1 ' ' 

122075 


- (Hj.jQ) hutuui 113. j — fruit fly 
(Drosophila melanogaster) 
histone H3.3B - chicken 
>gi|21 19023|pir||S61218 histone 
H3.3 - fruit fly (Drosophila 
hydei) 1-136) [Oryctolagus 
cuniculus] >gi|8046 (X53822) 
Histone H3.3Q gene product 
[Drosophila melanogaster] 
>gi|51198gallus] >gi|161190 
(M17876) histone H3 [Spisula 
solidissima] >gi|211853 
(Ml 1393) histone 3.3 [Gallus 
gallus] >gi|306848 (Ml 1354) 
H3.3 histone [Homo sapiens] 
melanogaster] >gi|963031 
(X81205) histone H3.3 H3.3A 
variant [Drosophila 

melanogaster] musculus] 


le-45 J 


1478 


U32498 


Rattus norvegicus 
rsec8 mRNA, partial 
cds 


e-108 


2143962 


rsec8 - rat (fragment) 
>gi|1019441 (U32498) rsecS 
[Rattus norvesicus] 


7e-48 I 


14791 


U41736 


Mus musculus ancient 
ubiquitous 46 kDa 
protein AUP1 
precursor (Aupl) 
mRNA, complete cds 


e-146 I 


1517822 


(U41736) ancient ubiquitous 46 
kDa protein AUP46 precursor 
[Mus musculus] 


5e-49 1 


I4.S0 1 


l 


Bos taurus vacuolar 
Droton pump subunit 
SFD alpha isoform 
:SFD)mRNA, ' 
:omplete cds 


e-119 1 


• 

1 

2895578 i 


;AF04133S) vacuolar proton 
3ump subunit SFD alpha J 
soform [Bos taurus] I 


3e -49 


1481 1 


I 

\ 

AF064553 c 


vlus musculus NSD1 
>rotein mRNA, 
'Omplete cds 


e-121 1 


\ 

3329465 


AF064553) NSD1 protein [ 
Mus musculus] 


2e-50 


1482 


1 
( 

s 

AB000517 c 


tattus sp. mRNA for 
ZDP-diacylglycerol 
ynthase, complete 
ds 


e-146 


( 
k 

1517822 r 


U41736) ancient ubiquitous 46 
Da protein AUP46 precursor 
Mus musculus] j 


2e -51 


1483| 


E 

D38517 c 


4ouse mRNA for 
>hml protein, 
omplete cds 


e-118 1 


n 

2137562 n 


louse Dhm I protein - mouse 
msculus] ! 


6e-54 | 
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SEC 
ID 


1 Neares 
J J 

Iaccessioi 


t Neighbor (BiastN vs. 

M DESCRIPTION 
M.domesticus MD6 


Genbank) 
P VALUE 


| Nearest Neish 
1 ACCESSION 


ibor (BlastX vs. Non-Redundant F 

DESCRIPTION 
L;JJ(J4 repeat unit-containino 


'rote ins) 
P VALUE 


148* 
1485 


i X54352 
U57692 


mRNA 

IMus musculus N- 
1 terminal asparagine 
amidohydrolase 
(Ntanl) mRNA, 
complete cds 


e-139 
e-118 


j 1085499 
J 2498797 


protein - mouse 

PROTEIN N- TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE 
(rKUlblN NH2 -TERMINAL 
ASPARAGINE DEAMIDASE) 
(NTN-AMEDASE) (PNAD) 
(PROTEIN NH2- TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE) 
(PNAA) >gi| 1373365 (U57691) 
N- terminal asparagine 
amidohydrolase [Mus musculus; 
amidohydrolase [Mus musculus] 


le-55 
5e-57 I 


1486 


X80169 


M. musculus mRNA 
for 200 kD protein 


e-119 


1717793 


PROTEIN TSG24 (MEIOTIC 
CHECK POINT 
REGULATOR) 
>gi|1083553|pir||A55117 tss24 


9e-58 j 


14871 


I 

U57692 c 


Mus musculus N- 
erminal asparagine 
imidohydrolase 
Ntanl) mRNA, 
:ompIete cds 


e-120 1 


j 

2498797 i 


PROTEIN N- TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE 
(rKLMtllN in H2- 1 ERMINAL 
ASPARAGINE DEAMIDASE) 
(NTN-AMIDASE) (PNAD) 
(PROTEIN NH2- TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE) 
(PNAA) >gi! 1373365 (U57691) 
N-terminaJ asparagine 
amidohydrolase [Mus musculus] 
imidohydrolase TMus musculus] 


8e -58 1 


14881 


r 

|r 

U08215 r 


vlus musculus Hsp70- 
elatedNST-1 (hsr n 
nRNA, complete cds. 


e-109 1 


( 

473407 r 


U08215) NST-1 [Mus 
nusculus] 


7e 58 


1489 I 


D85926 b 


/louse mRNA for 
tay, complete cds 


e-110 J 


1944389 ( 


D85926) Rav [Mus musculusl 


2e 58 


1490 1 


r 

d 

e 

Irr 

L20427 rr 


.anus norvegicus 

ihydroxypolyprenyib 

nzoate 

lethyltransferase 
iRNA, complete cds 


e-123 1 


( 
d 
n 
d 
n 

457372 n 


L20427) 

ihydroxypolyprenylbenzoate 
lethyltransferase 
ihydroxypolyprenylbenzoate 
lethyltransferase [Rattus 
orvegicus] 


4e 59 1 


1491 1 


X56044 |fc 


I.musculus mRNA 
)r protein Htf9C 


e-121 1 


P 

3183977 rr 


<56044) protein Htf9C [Mus 
usculus] 


le-60 1 



mo 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESS TON 


I DESCRIPTION 


P VAT T rc 


ACCESSION 


DESCRIPTION 


P VALUE 












PftOTO-ONCOciENE 




1492 


<?7477a 


p59fyn(T)=OKT3- 
induced calcium 
influx regulator 


e-163 


729896 


TYROS INE-PROTEIN 
KINASE FYN (P59-FYN) 
>gi|4202 1 7|pir|| A4499 1 protein- 
tyrosine kinase (EC 2.7.1.112) 
. fyn - mouse 


8e-63 


1493 


U88873 


Mus musculus BUB2 
like protein 1 
(HBLPl)mRNA, 
complete cds 


e-123 


4099611 


(U88873) BUB2-like protein 1 
[Mus musculus] 


le-63 


1494 


U4oojZ 


Cricetulus griseus HT 
protein mRNA, 
complete cds. 


e-117 


1216486 


(U48852) HT protein 
(Cricetulus griseusl 


7e-64 


1495 


AF032667 


Rattus norvegicus 
rexo70 mRNA, 
complete cds 


e-142 


2827160 


(AF032667) rexo70 [Rattus 
norvegicus] 


5e-66 


1496 


M62722 


Chinese hamster 
phosphatidy 1 seri ne 
decarboxylase 
mRNA, 3' end. 


e-114 


118910 


r'riUSr'fiA 1 ID YL^bKlNE 

DECARBOXYLASE 

PROENZYME 

>gi|109423|pir||A38732 

phosphatidylserine 

decarboxylase (EC 4.1.1.65) - 

Chinese hamster (fragment) 


2e-67 


1497 


AF072758 


Mus musculus fatty 
acid transport protein 
3 mRNA, partial cds 


e-130 


3335567 


(AF072758) fatty acid transport 
protein 3; FATP3 [Mus 
rnusculusl 


le-67 


1498 




Rattus norvegicus 
mRNA for atypical 
PKC specific binding 
protein, complete cds 


e-1 13 


3868778 


(AB005549) atypical PKC 
specific binding protein [Rattus 
norvegicus] 


2e-69 


1499 


U57344 


Mus musculus 
homeobox protein 
Meis3 mRNA, 
complete cds 


e-I43 




HOMEOBOX PROTEIN 

MTT<\1 


oe- ! Z 


1500 


U09874 


Mus musculus SKD3 
mRNA, complete cds. 


e-142 


2493735 


SKD3 PROTEIN SKD3 [Mus 
musculus] 


le-72 


1501 


] 
i 

U72194 ( 


VIus musculus 
nuskelin mRNA, 
romplete cds 


e-148 


( 

3493462 i 


U72194) muskelin [Mus 
nusculus] 


2e-74 


1502 


I 

XS0169 f 


vl. musculus mRNA 
or 200 kD protein 


e-155 


1 

( 
I 

1717793 : 


PROTEIN TSG24 (MEIOTIC 
:heck POINT 

REGULATOR) 

>gi|1083553|pir||A55H7 ts.s24 


3e-77 
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Neighbor (BlastN vs. Genbank) 


. Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 
Mus musculus 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


1503 


U72194 


muskelin mRNA, 
complete cds 


e-154 


3493462 


(U72194) muskelin [Mus 
musculus] 


2e-73 


1504 


Y12836 


Cricetulus griseus 
mRNA for Zn finger 
factor 


e-146 


3150148 


(Y 12836) Zn finger factor 
[Cricetulus griseus] 


3e-83 ! 
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Table 5 



SEQ ID | Start | Stop | Score | Direction | Description
29 | 295 | 421 | 5872 | For | mkk like kinases
30 | 31 | 182 | 3943 | For | Basic region plus leucine zipper transcription factors
31 | 298 | 397 | 5625 | For | mkk like kinases
186 | 175 | 395 | 7660 | For | SH2 Domain
187 | 358 | 432 | 4320 | For | Ank repeat
196 | 37 | 322 | 6049 | For | mkk like kinases
234 | 23 | 121 | 4607 | For | SH3 Domain
 | 1 1 0 | 1 7? 1 1 L | 41 | For |
41 n | | 101 171 | 40^6 | For | Basic region plus leucine zipper transcription factors
1 | 71 | 478 | JJJO | Rev | ATPases Associated with Various Cellular Activities
552 | 116 | 288 | 3930 | Rev | Basic region plus leucine zipper transcription factors
639 | 157 | 561 | 5797 | For | ATPases Associated with Various Cellular Activities
746 | 209 | 427 | 5379 | For | Fibronectin type III domain
768 | 116 | 288 | 3930 | For | Basic region plus leucine zipper transcription factors
807 | 339 | 392 | 3620 | For | Zinc finger, C2H2 type
820 | 341 | 406 | 2930 | Rev | EF-hand
822 | 108 | 262 | 4179 | For | Basic region plus leucine zipper transcription factors
836 | 158 | 353 | 4430 | For | Basic region plus leucine zipper transcription factors
1157 | 41 | 444 | 5279 | Rev | protein kinase
1192 | 186 | 416 | 5469 | For | Fibronectin type III domain
1268 | 238 | 315 | 3540 | For | Ank repeat
1269 | 79 | 240 | 11640 | For | LIM domain containing proteins
1288 | 73 | 234 | 3953 | For | Basic region plus leucine zipper transcription factors
1309 | AO | 404 | alio | for | LIM domain containing proteins
1324 | 294 | 356 | 4690 | for | Zinc finger, C2H2 type
1325 | 1 | 234 | 8981 | for | C2 domain (prot. kinase C like)
1336 | 66 | 164 | 6390 | for | WD domain, G-beta repeats
1360 | 222 | 377 | 8686 | for | LIM domain containing proteins
1365 | 69 | 257 | 5221 | for | Basic region plus leucine zipper transcription factors
1380 | 42 | 140 | 7130 | for | WD domain, G-beta repeats
1386 | 243 | 398 | 8736 | for | LIM domain containing proteins
1410 | 222 | 350 | 10553 | for | Trypsin
1417 | 8 | 354 | 6073 | for | Protein Tyrosine Phosphatase
1454 | 49 | 209 | 3996 | for | Basic region plus leucine zipper transcription factors
1464 | 4 | 180 | 4978 | for | RNA recognition motif, (aka RRM, RBD, or RNP domain)
1478 | 54 | 437 | 5176 | for | protein kinase
1496 | 241 | 520 | 3929 | for | Helicases conserved C-terminal domain
1496 | 40 | 612 | 5187 | for | protein kinase
1503 | 154 | 216 | 4870 | for | Zinc finger, C2H2 type
1514 | 2 | 252 | 4662 | for | RNA recognition motif, (aka RRM, RBD, or RNP domain)
1527 | 156 | 212 | 3520 | for | Zinc finger, C2H2 type
1538 | 9 | 635 | 11087 | for | wnt family of developmental signaling proteins
1540 | 289 | 471 | 4107 | for | Basic region plus leucine zipper transcription factors
1549 | 200 | 391 | 4118 | for | Basic region plus leucine zipper transcription factors
1556 | 163 | 354 | 3958 | for | Basic region plus leucine zipper transcription factors
1557 | 207 | 398 | 4038 | for | Basic region plus leucine zipper transcription factors
1563 | 107 | 298 | 3978 | for | Basic region plus leucine zipper transcription factors
1622 | 180 | 365 | 4022 | for | Basic region plus leucine zipper transcription factors
1630 | 100 | 291 | 3998 | for | Basic region plus leucine zipper transcription factors
1674 | 196 | 258 | 4880 | for | Zinc finger, C2H2 type
1676 | 9 | 86 | 6610 | for | Homeobox Domain
1677 | 316 | 369 | 5780 | rev | Thioredoxins
1688 | 109 | 410 | 17414 | for | Ras family
1704 | 184 | 372 | 3977 | for | Basic region plus leucine zipper transcription factors
1707 | 92 | 439 | 24100 | rev | Phosphatidylinositol-specific phospholipase C, Y domain
1711 | 263 | 361 | 6400 | for | WD domain, G-beta repeats
1744 | 238 | 433 | 10572 | rev | Serine carboxypeptidases
1755 | 281 | 367 | 2580 | for | EF-hand
1762 | 236 | 334 | 5880 | for | WD domain, G-beta repeats
1779 | 64 | 126 | 4790 | for | Zinc finger, C2H2 type
1801 | 295 | 351 | 4030 | for | Zinc finger, C2H2 type
1804 | 301 | 378 | 3460 | for | Ank repeat
1808 | 36 | 161 | 4170 | for | Basic region plus leucine zipper transcription factors
1811 | 184 | 315 | 8390 | for | N-terminal homology in Ets domain
1814 | 127 | 294 | 10770 | for | Bromodomain (conserved sequence found in human, Drosophila and yeast proteins.)
1 O 1 0 / 1 ft I 5 | a y | 1 A £L | 4741 | for | Double-stranded RNA binding motif
1819 | 2 /5 | | 34oU | for | Ank repeat
1820 | 123 | 299 | 12150 | for | Homeobox Domain
1821 | 1 T7 / 111 | JUi | 121 oU | for | Homeobox Domain
1 Q^A | 1 04 / 1 54 | ZO / | 4z /U | for | Ank repeat
1832 | 18 | 173 | 8987 | for | SH3 Domain
1835 | 51 | 206 | 8987 | for | SH3 Domain
1839 | 224 | 307 | 4270 | for | Ank repeat
1846 | 12 | 398 | 36700 | for | G-protein alpha subunit
1909 | 160 | 258 | 6370 | for | WD domain, G-beta repeats
1911 | 35 | 151 | 9335 | for | Zinc finger, C3HC4 type (RING finger)
1980 | 60 | 197 | 7917 | for | Zinc finger, C3HC4 type (RING finger)
2065 | 253 | 306 | 5410 | for | Zinc finger, CCHC class
2135 | 2 | 401 | 10596 | for | ATPases Associated with Various Cellular Activities
2216 | 90 | 179 | 5380 | for | WW/rsp5/WWP domain containing proteins
2218 | 127 | 225 | 5500 | for | WD domain, G-beta repeats
2281 | 20 | 387 | 6044 | for | Protein Tyrosine Phosphatase
2282 | 183 | 353 | 5136 | for | C2 domain (prot. kinase C like)
2286 | 12 | 382 | 5228 | for | protein kinase
2310 | 20 | 371 | 5962 | for | Protein Tyrosine Phosphatase
2363 | 48 | 211 | 4132 | for | Basic region plus leucine zipper transcription factors
2424 | 43 | 194 | 3996 | for | Basic region plus leucine zipper transcription factors
2428 | 25 | 350 | 4675 | for | Dual specificity phosphatase, catalytic domain
2562 | 18 | 101 | 4560 | for | Ank repeat
2577 | 0 | 311 | 10295 | for | 4 transmembrane segments integral membrane proteins
2591 | 60 | 165 | 4560 | for | SH2 Domain
2684 | 9 | 461 | 5759 | for | ATPases Associated with Various Cellular Activities
2826 | 116 | 400 | 16107 | for | DEAD and DEAH box helicases
2859 | 100 | 320 | 5550 | rev | ATPases Associated with Various Cellular Activities
2871 | 198 | 392 | 9384 | for | DEAD and DEAH box helicases
2944 | 18 | 281 | 10480 | for | Calpain large subunit, domain III
2969 | 5 | 387 | 5976 | rev | protein kinase
3015 | 131 | 214 | 3600 | for | Ank repeat
3047 | 191 | 292 | 5295 | for | WD domain, G-beta repeats
3081 | 190 | 252 | 4360 | for | Zinc finger, C2H2 type
3108 | 275 | 367 | 5791 | for | WD domain, G-beta repeats
3147 | 190 | 369 | 4022 | for | Basic region plus leucine zipper transcription factors
3152 | 129 | 320 | 3947 | for | Basic region plus leucine zipper transcription factors
3158 | 167 | 334 | 4180 | for | Basic region plus leucine zipper transcription factors
3175 | 14 | 164 | 5951 | for | mkk like kinases
3175 | 8 | 112 | 5968 | for | protein kinase
3178 | 45 | 386 | 19398 | for | ATPases Associated with Various Cellular Activities
3183 | 14 | 215 | 9133 | for | 4 transmembrane segments integral membrane proteins
3190 | 229 | 390 | 6089 | for | mkk like kinases
3190 | 118 | 390 | 8063 | for | protein kinase
3193 | 293 | 355 | 3570 | for | Zinc finger, C2H2 type
3195 | 0 | 215 | 10146 | for | 4 transmembrane segments integral membrane proteins
3197 | 281 | 343 | 4490 | for | Zinc finger, C2H2 type
3208 | 34 | 256 | 4190 | for | Basic region plus leucine zipper transcription factors
3258 | 138 | 394 | 9877 | for | Ras family
3266 | 8 | 139 | 9328 | for | ATPases Associated with Various Cellular Activities
3267 | 97 | 180 | 3820 | for | Ank repeat
3274 | 11 | 187 | 15442 | for | Fork head domain, eukaryotic transcription factors
3281 | 15 | 182 | 9681 | for | mkk like kinases
3285 | 16 | 102 | 4680 | for | EF-hand
3292 | 208 | 300 | 5585 | for | WD domain, G-beta repeats
3297 | 7 | 153 | 6100 | for | Helicases conserved C-terminal domain
3306 | 161 | 223 | 4900 | for | Zinc finger, C2H2 type
3307 | 43 | 321 | 8740 | for | SH2 Domain
3339 | 94 | 342 | 14970 | for | SH2 Domain
3345 | 65 | 271 | 12512 | for | PDZ domain
3351 | 124 | 270 | 6068 | for | Phorbol esters/diacylglycerol binding

Example 4

Differential Expression of Polynucleotides of the Invention:
Description of Libraries and Detection of Differential Expression

The relative expression levels of the polynucleotides of the invention were assessed
in several libraries prepared from various sources, including cell lines and patient
tissue samples. Table 6 provides a summary of these libraries, including the shortened
library name (used hereafter), the mRNA source used to prepare the cDNA library, the
abbreviated name of the library that is used in the tables below (in quotes), and the
approximate number of clones in the library.



Table 6

Description of cDNA Libraries

Library (lib #) | Description | Number of Clones in this Clustering
1 | Km12L4; Human Colon Cell Line, High Metastatic Potential (derived from Km12C); "High Colon" | 307133
2 | Km12C; Human Colon Cell Line, Low Metastatic Potential; "Low Colon" | 284755
3 | MDA-MB-231; Human Breast Cancer Cell Line, High Metastatic Potential; micro-metastases in lung; "High Breast" | 326937
4 | MCF7; Human Breast Cancer Cell, Non Metastatic; "Low Breast" | 318979
8 | MV-522; Human Lung Cancer Cell Line, High Metastatic Potential; "High Lung" | 223620
9 | UCP-3; Human Lung Cancer Cell Line, Low Metastatic Potential; "Low Lung" | 312503
12 | Human microvascular endothelial cells (HMEC) - Untreated; PCR (OligodT) cDNA library | 41938
13 | Human microvascular endothelial cells (HMEC) - Basic fibroblast growth factor (bFGF) treated; PCR (OligodT) cDNA library | 42100
14 | Human microvascular endothelial cells (HMEC) - Vascular endothelial growth factor (VEGF) treated; PCR (OligodT) cDNA library | 42825
15 | Normal Colon - UC#2 Patient; PCR (OligodT) cDNA library; "Normal Colon Tumor Tissue" | 34285
16 | Colon Tumor - UC#2 Patient; PCR (OligodT) cDNA library; "Normal Colon Tumor Tissue" | 35625
17 | Liver Metastasis from Colon Tumor of UC#2 Patient; PCR (OligodT) cDNA library; "High Colon Metastasis Tissue" | 36984
18 | Normal Colon - UC#3 Patient; PCR (OligodT) cDNA library; "Normal Colon Tumor Tissue" | 36216
19 | Colon Tumor - UC#3 Patient; PCR (OligodT) cDNA library; "High Colon Tumor Tissue" | 41388
20 | Liver Metastasis from Colon Tumor of UC#3 Patient; PCR (OligodT) cDNA library; "High Colon Metastasis Tissue" | 30956
21 | GRRpz; Human Prostate Cell Line | 164801
22 | WOca; Human Prostate Cancer Cell Line | 162088



The KM12L4 and KM12C cell lines are described in Example 1 above. The MDA-MB-231
cell line was originally isolated from pleural effusions (Cailleau, J. Natl. Cancer
Inst. (1974) 53:661), is of high metastatic potential, and forms poorly differentiated
adenocarcinoma grade II in nude mice consistent with breast carcinoma. The MCF7 cell
line was derived from a pleural effusion of a breast adenocarcinoma and is
non-metastatic. The MV-522 cell line is derived from a human lung carcinoma and is of
high metastatic potential. The UCP-3 cell line is a low metastatic human lung carcinoma
cell line; the MV-522 is a high metastatic variant of UCP-3. These cell lines are
well-recognized in the art as models for the study of human breast and lung cancer
(see, e.g., Chandrasekaran et al., Cancer Res. (1979) 39:870 (MDA-MB-231 and MCF-7);
Gastpar et al., J Med Chem (1998) 41:4965 (MDA-MB-231 and MCF-7); Ranson et al., Br J
Cancer (1998) 77:1586 (MDA-MB-231 and MCF-7); Kuang et al., Nucleic Acids Res (1998)
26:1116 (MDA-MB-231 and MCF-7); Varki et al., Int J Cancer (1987) 40:46 (UCP-3); Varki
et al., Tumour Biol (1990) 11:327 (MV-522 and UCP-3); Varki et al., Anticancer Res.
(1990) 10:637 (MV-522); Kelner et al., Anticancer Res (1995) 15:867 (MV-522); and Zhang
et al., Anticancer Drugs (1997) 8:696 (MV-522)). The samples of libraries 15-20 are
derived from two different patients (UC#2 and UC#3). The bFGF-treated HMEC were
prepared by incubation with bFGF at 10 ng/ml for 2 hrs; the VEGF-treated HMEC were
prepared by incubation with 20 ng/ml VEGF for 2 hrs. Following incubation with the
respective growth factor, the cells were washed and lysis buffer added for RNA
preparation. The GRRpz cell line refers to low passage (3 passages or fewer) human
prostate cells, and the WOca cell line refers to low passage (3 passages or fewer)
human prostate cancer cells.

Each of the libraries is composed of a collection of cDNA clones that in turn are
representative of the mRNAs expressed in the indicated mRNA source. In order to
facilitate the analysis of the millions of sequences in each library, the sequences
were assigned to clusters. The concept of "cluster of clones" is derived from a
sorting/grouping of cDNA clones based on their hybridization pattern to a panel of
roughly 300 7-bp oligonucleotide probes (see Drmanac et al., Genomics (1996)
37(1):29). Random cDNA clones from a tissue library are hybridized at moderate
stringency to 300 7-bp oligonucleotides. Each oligonucleotide has some measure of
specific hybridization to that specific clone. The combination of 300 of these measures
of hybridization for 300 probes equals the "hybridization signature" for a specific
clone. Clones with similar sequence will have similar hybridization signatures. By
developing a sorting/grouping algorithm to analyze these signatures, groups of clones
in a library can be identified and brought together computationally. These groups of
clones are termed "clusters". Depending on the stringency of the selection in the
algorithm (similar to the stringency of hybridization in a classic library cDNA
screening protocol), the "purity" of each cluster can be controlled. For example,
artifacts of clustering may occur in computational clustering just as artifacts can
occur in "wet-lab" screening of a cDNA library with 400 bp cDNA fragments, at even the
highest stringency. The stringency used in the implementation of clustering herein
provides groups of clones that are in general from the same cDNA or closely related
cDNAs. Closely related clones can be a result of different length clones of the same
cDNA, closely related clones from highly related gene families, or splice variants of
the same cDNA.
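
The grouping of clones by hybridization signature can be illustrated with a minimal sketch. The description does not specify the sorting/grouping algorithm, so everything below is an assumption made for illustration only: signatures are compared by Pearson correlation, clustering is a simple greedy pass, the function names `pearson` and `cluster_clones` are hypothetical, and the `stringency` threshold merely plays the role of the selection stringency mentioned above. Toy 4-probe signatures stand in for the roughly 300-probe signatures.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length hybridization signatures."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat signature carries no pattern to correlate
    return cov / sqrt(var_a * var_b)


def cluster_clones(signatures, stringency=0.9):
    """Greedy grouping (illustrative assumption, not the patent's algorithm).

    Each clone joins the first cluster whose founding clone has a signature
    correlating at or above `stringency`; otherwise it founds a new cluster.
    A higher `stringency` gives "purer" but more numerous clusters.
    """
    clusters = []  # each cluster is a list of clone names; element 0 is the founder
    for name, sig in signatures.items():
        for members in clusters:
            if pearson(signatures[members[0]], sig) >= stringency:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Toy signatures: one hybridization measure per probe (4 probes instead of ~300).
signatures = {
    "clone_a": [0.90, 0.10, 0.80, 0.20],
    "clone_b": [0.85, 0.15, 0.75, 0.25],  # pattern close to clone_a
    "clone_c": [0.10, 0.90, 0.20, 0.80],  # opposite pattern
}
print(cluster_clones(signatures, stringency=0.9))
# [['clone_a', 'clone_b'], ['clone_c']]
```

Raising `stringency` toward 1.0 in this sketch splits groups more aggressively, which mirrors the trade-off between cluster purity and artifacts described in the text.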



Differential expression for a selected cluster was assessed by first determining the
number of cDNA clones corresponding to the selected cluster in the first library
(Clones in 1st), and then determining the number of cDNA clones corresponding to the
selected cluster in the second library (Clones in 2nd). Differential expression of the
selected cluster in the first library relative to the second library is expressed as a
"ratio" of percent expression between the two libraries. In general, the "ratio" is
calculated by: 1) calculating the percent expression of the selected cluster in the
first library by dividing the number of clones corresponding to a selected cluster in
the first library by the total number of clones analyzed from the first library;
2) calculating the percent expression of the selected cluster in the second library by
dividing the number of clones corresponding to a selected cluster in a second library
by the total number of clones analyzed from the second library; 3) dividing the
calculated percent expression from the first library by the calculated percent
expression from the second library. If the "number of clones" corresponding to a
selected cluster in a library is zero, the value is set at 1 to aid in calculation. The
formula used in calculating the ratio takes into account the "depth" of each of the
libraries being compared, i.e., the total number of clones analyzed in each library.

In general, a polynucleotide is said to be significantly differentially expressed
between two samples when the ratio value is greater than at least about 2, preferably
greater than at least about 3, more preferably greater than at least about 5, where the
ratio value is calculated using the method described above. The significance of
differential expression is determined using a z score test (Zar, Biostatistical
Analysis, Prentice Hall, Inc., USA, "Differences between Proportions," pp 296-298
(1974)).
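
The ratio calculation and significance test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure used: the function names `expression_ratio` and `two_proportion_z` are hypothetical, the z statistic is the standard pooled two-proportion form (the cited text may apply a variant such as a continuity correction), and the z statistic is computed here on the raw counts rather than on the zero-adjusted counts. The clone totals are the lib3 and lib4 values from Table 6.

```python
from math import sqrt


def expression_ratio(clones_1st, total_1st, clones_2nd, total_2nd):
    """Ratio of percent expression, first library relative to second.

    A clone count of zero is set to 1 "to aid in calculation", as described.
    Dividing by each library's total accounts for library depth.
    """
    c1 = clones_1st if clones_1st > 0 else 1
    c2 = clones_2nd if clones_2nd > 0 else 1
    pct_1st = c1 / total_1st
    pct_2nd = c2 / total_2nd
    return pct_1st / pct_2nd


def two_proportion_z(clones_1st, total_1st, clones_2nd, total_2nd):
    """Pooled two-proportion z statistic (assumed form of the z score test)."""
    p1 = clones_1st / total_1st
    p2 = clones_2nd / total_2nd
    pooled = (clones_1st + clones_2nd) / (total_1st + total_2nd)
    se = sqrt(pooled * (1 - pooled) * (1 / total_1st + 1 / total_2nd))
    if se == 0:
        return 0.0  # no clones in either library: nothing to compare
    return (p1 - p2) / se


LIB3_TOTAL = 326937  # MDA-MB-231 ("High Breast"), Table 6
LIB4_TOTAL = 318979  # MCF7 ("Low Breast"), Table 6

# A cluster with 64 clones in lib3 and 0 clones in lib4.
ratio = expression_ratio(64, LIB3_TOTAL, 0, LIB4_TOTAL)
z = two_proportion_z(64, LIB3_TOTAL, 0, LIB4_TOTAL)
print(round(ratio), ratio > 5)
# 62 True
print(z > 1.96)
# True
```

Rounding the ratio to the nearest integer is an assumption; with it, the 64-versus-0 example reproduces the lib3/lib4 value of 62 listed in Table 7 for a cluster with those counts.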
EXAMPLE 5

Polynucleotides Differentially Expressed in High Metastatic Potential
Breast Cancer Cells Versus Low Metastatic Breast Cancer Cells

A number of polynucleotide sequences have been identified that are differentially
expressed between cells derived from high metastatic potential breast cancer tissue and
low metastatic breast cancer cells. Expression of these sequences in breast cancer can
be valuable in determining diagnostic, prognostic and/or treatment information. For
example, sequences that are highly expressed in the high metastatic potential cells can
be indicative of increased expression of genes or regulatory sequences involved in the
metastatic process. A patient sample displaying an increased level of one or more of
these polynucleotides may thus warrant more aggressive treatment. In another example,
sequences that display higher expression in the low metastatic potential cells can be
associated with genes or regulatory sequences that inhibit metastasis, and thus the
expression of these polynucleotides in a sample may warrant a more positive prognosis
than the gross pathology would suggest.

The differential expression of these polynucleotides can be used as a diagnostic
marker, a prognostic marker, for risk assessment, patient treatment and the like. These
polynucleotide sequences can also be used in combination with other known molecular
and/or biochemical markers.

The following tables summarize polynucleotides that are differentially expressed
between high metastatic potential breast cancer cells and low metastatic potential
breast cancer cells.

Table 7

Differentially expressed polynucleotides: Higher expression in
high metastatic potential breast cancer (lib3) relative to low metastatic
breast cancer cells (lib4)

SEQ ID NOs: | Lib3 clones | Lib4 clones | lib3/lib4
472 | 64 | 0 | 62
1851 | 6 | 0 | 6
1856 | 8 | 0 | 8
1867 | 6 | 0 | 6
1872 | 6 | 0 | 6
1875 | 12 | 3 | 4
1923 | 89 | 22 | 4
2118 | 7 | 0 | 7
2119 | 7 | 0 | 7
2135 | 37 | 13 | 3
2190 | 19 | 0 | 19
2193 | 16 | 5 | 3
2232 | 12 | 2 | 6
2239 | 6 | 0 | 6
2338 | 21 | 2 | 10
2378 | 16 | 4 | 4
2394 | 6 | 0 | 6
2395 | 6 | 0 | 6
2490 | 13 | 3 | 4
2505 | 16 | 2 | 8
2540 | 8 | 1 | 8
2542 | 11 | 1 | 11
2607 | 11 | 2 | 5
2640 | 22 | 5 | 4
2674 | 8 | 0 | 8
2679 | 19 | 0 | 19
2684 | 14 | 4 | 3
2707 | 8 | 0 | 8
2724 | 9 | 0 | 9
2757 | 6 | 0 | 6
2776 | 10 | 0 | 10
2804 | 13 | 2 | 6
2818 | 6 | 0 | 6
2906 | 14 | 0 | 14
2959 | 26 | 8 | 3
2964 | 17 | 4 | 4
2968 | 6 | 0 | 6
2977 | 22 | 3 | 7
2980 | 13 | 1 | 13
3010 | 6 | 0 | 6
3043 | 10 | 1 | 10
3071 | 33 | 12 | 3
3072 | 9 | 1 | 9
3095 | 19 | 3 | 6
3097 | 11 | 2 | 5
3173 | 12 | 2 | 6
3203 | 8 | 1 | 8
3210 | 27 | 8 | 3
3212 | 13 | 1 | 13
3284 | 8 | 0 | 8
3288 | 6 | 0 | 6
3331 | 14 | 3 | 5
3335 | 13 | 1 | 13



Table 8 

Differentially expressed polynucleotides: Higher expression in 
low metastatic breast cancer cells (lib4) relative to high metastatic 
5 potential breast cancer (lib3) 



SEQ ID NOs: 


Lib 3 Clones 


Lib 4 Clones 


Iib4/lib3 


402 


0 


6 


6 


614 


3 


21 


7 


624 


0 


6 


6 


626 


0 


8 


8 


712 


0 


9 


9 


744 


0 


7 


7 


1325 


2 


29 


15 


1452 


2 


13 


7 


1880 


0 


9 


9 


1915 


0 


7 


7 


1951 


0 


6 


6 


1955 


8 


32 


4 


2015 


0 


7 


7 


2046 


0 


7 


7 


2076 


1 


22 


23 


2087 


0 


6 


6 


2124 


0 


9 


9 


2145 


0 


8 


8 


2162 


0 


6 


6 


2163 


0 


12 


12 


2164 


5 


19 


4 


2172 


2 


15 


8 


2192 


5 


16 


3 


2244 


20 


43 


2 


2266 


3 


18 


6 


2313 


24 


56 


2 


2346 


1 


13 


13 



4^ 
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SEQ ID NOs: 


Lib 3 Clones 


Lib 4 Clones 


Iib4/lib3 


2355 


0 


10 


10 


2371 


0 


6 


6 


2393 


1 


17 


17 


2404 


1 


21 


22 


2443 


0 


6 


6 


2460 


0 


11 


11 


2523 


0 


6 


6 


2575 


1 


10 


10 


2578 


0 


6 


6 


2584 


1 


17 


17 


2590 


0 


6 


6 


2609 


1 


9 


9 


2632 


5 


24 


5 


2714 


5 


24 


5 


2728 


0 


6 


6 


2752 


1 


14 


14 


2794 


4 


15 


4 


2826 


0 


7 


7 


2987 


5 


15 


3 


3005 


1 


14 


14 


3009 


20 


58 


3 


3047 


4 


17 


4 


3057 


2 


17 


9 


3075 


2 


11 


6 


3076 


0 


6 


6 


3102 


0 


6 


6 


3128 


15 


52 


4 


3132 


15 


52 


4 


3142 


0 


6 


6 


3187 


22 


49 


2 


3253 


23 


96 


4 


3282 


19 


46 


2 


3285 


20 


40 


2 


3346 


0 


9 


9 

EXAMPLE 6 

Polynucleotides Differentially Expressed in High Metastatic Potential Lung 
Cancer Cells Versus Low Metastatic Lung Cancer Cells 

A number of polynucleotide sequences have been identified that are 
differentially expressed between cells derived from high metastatic potential lung 
cancer cells and low metastatic lung cancer cells. Expression of these sequences in lung 
cancer tissue can be valuable in determining diagnostic, prognostic and/or treatment 
information. For example, sequences that are highly expressed in the high metastatic 
potential cells can be indicative of increased expression of genes or regulatory 
sequences involved in the metastatic process. A patient sample displaying an increased 
level of one or more of these polynucleotides may thus warrant more aggressive 
treatment. In another example, sequences that display higher expression in the low 
metastatic potential cells can be associated with genes or regulatory sequences that 
inhibit metastasis, and thus the expression of these polynucleotides in a sample may 
warrant a more positive prognosis than the gross pathology would suggest. 

The differential expression of these polynucleotides can be used as a 
diagnostic or prognostic marker, and for risk assessment, patient treatment and the 
like. These polynucleotide sequences can also be used in combination with other 
known molecular and/or biochemical markers. 

The following tables summarize polynucleotides that are differentially 
expressed between high metastatic potential lung cancer cells and low metastatic 
potential lung cancer cells: 

Table 9 

Differentially expressed polynucleotides: Higher expression in high 
metastatic potential lung cancer cells (lib8) relative to low 
metastatic lung cancer cells (lib9) 



SEQ ID NO: 


Lib8 clones 


Lib9 clones 


lib8/lib9 


14 


10 


0 


10 


137 


5 


0 


5 


151 


5 


0 


7 


152 


9 


0 


13 


171 


6 


0 


8 


200 


10 


0 


14 


254 


5 


0 


7 


262 


5 


0 


7 


271 


5 


0 


7 


348 


6 


1 


8 


412 


5 


0 


7 


507 


5 


0 


7 


520 


6 


0 


8 


530 


5 


0 


7 


588 


5 


0 


7 


623 


7 


0 


10 


637 


7 


0 


10 


660 


5 


0 


7 


678 


8 


0 


11 


680 


5 


0 


7 


700 


9 


2 


6 


714 


28 


13 


3 


774 


11 


0 


15 


812 


5 


0 


7 


834 


8 


2 


6 


901 


11 


2 


8 


1168 


5 


0 


7 


1333 


6 


0 


8 


1352 


5 


0 


7 


1524 


11 


1 


15 


1706 


5 


0 


7 


1752 


17 


9 


3 


1768 


20 


4 


7 


1769 


5 


0 


7 


1780 


6 


0 


8 


SEQ ID NO: 


Lib8 clones 


Lib9 clones 


lib8/lib9 


1781 


40 


3 


19 


1799 


6 


1 


8 


1803 


6 


1 


8 


1811 


16 


9 


2 


1884 


6 


0 


8 


1919 


8 


1 


11 


1939 


6 


0 


8 


1975 


43 


9 


7 


2024 


12 


1 


17 


2045 


8 


1 


11 


2060 


20 


13 


2 


2071 


16 


4 


6 


2128 


5 


0 


7 


2177 


10 


2 


7 


2181 


44 


13 


5 


2184 


11 


1 


15 


2185 


10 


4 


3 


2283 


7 


0 


10 


2311 


10 


4 


3 


2314 


10 


0 


14 


2393 


14 


6 


3 


2398 


6 


1 


8 


2460 


10 


4 


3 


2514 


6 


0 


8 


2597 


5 


0 


7 


2657 


8 


2 


6 


2669 


6 


1 


8 


2670 


6 


1 


8 


3047 


21 


3 


10 


3050 


16 


5 


4 


3092 


7 


1 


10 


3140 


181 


119 


2 


3157 


5 


0 


7 


3187 


16 


5 


4 


3210 


5 


0 


7 


3220 


28 


4 


10 


3236 


7 


1 


10 


3249 


16 


0 


22 


3264 


8 


2 


6 


3305 


7 


0 


10 


3309 


20 


0 


28 


SEQ ID NO: 


Lib8 clones 


Lib9 clones 


lib8/lib9 


3318 


24 


4 


8 


3330 


5 


0 


7 


3331 


5 


0 


7 



Table 10 

Differentially expressed polynucleotides: Higher expression in low metastatic lung 
cancer cells (lib 9) relative to high metastatic potential lung cancer cells (lib 8) 



SEQ ID NO: 


Lib 8 clones 


Lib 9 clones 


lib 9/lib 8 


24 


3 


20 


5 


53 


0 


18 


13 


64 


0 


8 


6 


70 


0 


11 


8 


105 


10 


66 


5 


129 


0 


16 


11 


214 


1 


14 


10 


233 


4 


35 


6 


237 


0 


13 


9 


264 


0 


29 


21 


329 


2 


17 


6 


368 


1 


37 


26 


370 


0 


11 


8 




u 


Q 
O 


/r 
D 


450 


0 


9 


6 


461 


0 


9 


6 


484 


0 


26 


19 


494 


0 


41 


29 


517 


1 


12 


9 


522 


1 


11 


8 


581 


1 


17 


12 


614 


3 


23 


5 


706 


0 


11 


8 


726 


5 


23 


3 


806 


0 


14 


10 


824 


0 


9 


6 


836 


1 


14 


10 


874 


0 


12 


9 


900 


5 


21 


3 


1017 


2 


14 


5 


SEQ ID NO: 


Lib 8 clones 


Lib 9 clones 


lib 9/lib 8 


1144 


0 


8 


6 


1154 


0 


12 


9 


1166 


2 


45 


16 


1170 


1 


13 


9 


1302 


2 


13 


5 


1326 


1 


13 


9 


1327 


1 


13 


9 


1367 


0 


12 


9 


1377 


0 


12 


9 


1437 


2 


18 


6 


1442 


1 


14 


10 


1466 


0 


13 


9 


1476 


0 


13 


9 


1495 


0 


8 


6 


1496 


1 


13 


9 


1664 


38 


253 


5 


1682 


1 


17 


12 


1687 


0 


9 


6 


1758 


0 


8 


6 


1817 


4 


18 


3 


1837 


3 


16 


4 


1845 


3 


23 


5 


1856 


2 


17 


6 


1910 


1 


18 


13 


2146 


2 


16 


9 


2156 


0 


9 


6 


2463 


0 


12 


9 


2724 


10 


38 


3 


2749 


403 


2000 


4 


2801 


6 


25 


3 


2993 


3 


18 


4 


3080 


0 


10 


7 


3107 


3 


23 


5 


3292 


0 


20 


14 


3324 


110 


548 


4 
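
The ratio columns in the tables above compare clone counts between two libraries. This section does not state how the ratios were calculated, so the following is only a minimal sketch of one plausible approach, not the method used to generate the tables: each count is normalized to the size of its library, and a zero count is replaced by a floor of one clone so that the ratio stays finite. The function name, the library sizes, and the floor value are all illustrative assumptions.

```python
def clone_ratio(count_a, count_b, size_a, size_b, floor=1):
    """Ratio of library-size-normalized clone counts, library A over library B.

    Hypothetical helper: `size_a` and `size_b` are the total numbers of clones
    sequenced in each library, and `floor` replaces a zero count so that the
    ratio is always defined. None of these values come from the source tables.
    """
    if size_a <= 0 or size_b <= 0:
        raise ValueError("library sizes must be positive")
    # Relative abundance of the sequence in each library, with the floor applied.
    freq_a = max(count_a, floor) / size_a
    freq_b = max(count_b, floor) / size_b
    return freq_a / freq_b


if __name__ == "__main__":
    # Illustrative numbers only: two equal-sized libraries, 20 clones vs. 3 clones.
    print(round(clone_ratio(20, 3, 10_000, 10_000), 1))
```

A floor of one clone is only one way to avoid dividing by zero; a different pseudocount or unequal library sizes would give different ratios.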






WO 01/02568 



PCT/US00/18374 



EXAMPLE 7 

Polynucleotides Differentially Expressed in High Metastatic Potential 
Colon Cancer Cells Versus Low Metastatic Colon Cancer Cells 

A number of polynucleotide sequences have been identified that are 
differentially expressed between cells derived from high metastatic potential colon 
cancer cells and low metastatic colon cancer cells. Expression of these sequences in 
colon cancer tissue can provide diagnostic, prognostic and/or treatment information. 
For example, sequences that are highly expressed in the high metastatic potential cells 
can be indicative of increased expression of genes or regulatory sequences involved in 
the metastatic process. A patient sample displaying an increased level of one or more of 
these polynucleotides may thus warrant more aggressive treatment. In another example, 
sequences that display higher expression in the low metastatic potential cells can be 
associated with genes or regulatory sequences that inhibit metastasis, and thus the 
expression of these polynucleotides in a sample may warrant a more positive prognosis 
than the gross pathology would suggest. 

The differential expression of these polynucleotides can be used as a 
diagnostic or prognostic marker, and for risk assessment, patient treatment and the 
like. These polynucleotide sequences can also be used in combination with other 
known molecular and/or biochemical markers. 

The following table summarizes identified polynucleotides with 
differential expression between high metastatic potential colon cancer cells and low 
metastatic potential colon cancer cells: 

Table 11 

Differentially expressed polynucleotides: Higher expression in low metastatic colon 
cancer cells (lib 2) relative to high metastatic potential colon cancer cells (lib 1) 



SEQ ID NOs: 


Lib 1 clones 


Lib 2 clones 


lib 2/lib 1 


429 


0 


9 


10 


1494 


0 


8 


9 


1923 


34 


114 


4 


1986 


3 


12 


4 


2018 


0 


9 


10 


2036 


2 


10 


5 


2049 


8 


25 


3 


2135 


24 


87 


4 


SEQ ID NOs: 


Lib 1 clones 


Lib 2 clones 


lib 2/lib 1 


2146 


2 


16 


9 


2208 


6 


27 


5 


2215 


2 


11 


6 


2239 


1 


10 


11 


2307 


2 


12 


6 


2313 


28 


62 


2 


2357 


5 


14 


3 


2360 


3 


21 


8 


2362 


0 


6 


6 


2378 


3 


12 


4 


2569 


3 


20 


7 


2571 


0 


6 


6 


2588 


54 


172 


3 


2592 


15 


41 


3 


2611 


0 


6 


6 


2636 


0 


9 


10 


2641 


7 


20 


3 


2650 


0 


9 


10 


2662 


0 


9 


10 


2674 


4 


13 


4 


2682 


0 


6 


6 


2702 


9 


25 


3 


2704 


8 


23 


3 


2715 


2 


12 


6 


2804 


9 


22 


3 


2821 


13 


29 


2 


2840 


1 


8 


9 


2846 


2 


15 


8 


2866 


0 


6 


6 


2906 


0 


6 


6 


2915 


44 


109 


3 


2933 


0 


6 


6 


2935 


5 


16 


3 


2957 


1 


11 


12 


2959 


3 


27 


10 


2977 


16 


30 


2 


2980 


12 


27 


2 


3000 


2 


13 


7 


3009 


12 


29 


3 


3115 


0 


7 


8 


3156 


502 


2170 


5 


SEQ ID NOs: 


Lib 1 clones 


Lib 2 clones 


lib 2/lib 1 


3210 


2 


21 


11 


3211 


0 


9 


10 


3213 


0 


7 


8 


3235 


2 


12 


6 


3251 


2 


12 


6 


3296 


3 


12 


4 


3335 


1 


8 


9 



EXAMPLE 8 

Polynucleotides Differentially Expressed in High Metastatic Potential 
Colon Cancer Patient Tissue Versus Normal Patient Tissue 

A number of polynucleotide sequences have been identified that are 
differentially expressed between cells derived from high metastatic potential colon 
cancer tissue and normal tissue. Expression of these sequences in colon cancer tissue 
can provide diagnostic, prognostic and/or treatment information. For example, 
sequences that are highly expressed in the high metastatic potential cells can be 
indicative of increased expression of genes or regulatory sequences involved in the 
advanced disease state, which involves processes such as angiogenesis, dedifferentiation, 
cell replication, and metastasis. A patient sample displaying an increased level of one 
or more of these polynucleotides may thus warrant more aggressive treatment. 

The differential expression of these polynucleotides can be used as a 
diagnostic or prognostic marker, and for risk assessment, patient treatment and the 
like. These polynucleotide sequences can also be used in combination with other 
known molecular and/or biochemical markers. 

The following tables summarize polynucleotides that are differentially 
expressed between high metastatic potential colon cancer tissue and normal colon 
tissue: 

Table 12 

Differentially expressed polynucleotides isolated from samples from two patients 
(patient 2 and patient 3): Lower expression in high metastatic potential colon tissue 
(patient 2: lib 17; patient 3: lib 20) vs. normal colon tissue (patient 2: lib 15; patient 
3: lib 18) 



SEQ ID NO: 


lib 15 clones 


lib 17 clones 


lib 15/lib 17 




1 Q 
IV 


n 

I 


3 


lz3 


ii 
o 


c\ 
u 


z: 
0 


1 a n 
140 


z4 


Q 

o 


3 


iy / 


z: 
O 


U 


z: 
O 


i no 


111 
1 1 J 


U 


1 O 1 

Izl 


zj4 


oo 
zo 


y 


3 


41z 


OO 

zo 


y 


3 


Mz 


1 1 

1 1 


l 


i o 
lz 


£.A 1 
641 


1 T 
1 / 


/ 


3 


o4z 


/ 


r\ 
V 


o 

5 


yM 


1 O 

Iz 


3 


4 


i m 1 
1 (J 1 1 


zuy 


lo 


1 >l 
14 


1 UZ4 


o 
o 


U 


Q 

y 


1 U4U 


IZ 




4 


IIOj 


zo 


/ 


A 

4 


1 1 f\A. 


1 1 
J 1 


1 c 
1 _) 


Z 


1 IOC 

1 1 ZD 


1 *7 
1 / 


U 


1 Q 
1 O 


1 1 TO 
1 IZy 


1 / 


n 
u 


1 C 

1 o 


1 1 JO 


i no 
i \j j 


u 


117 
11/ 


1 ZH-^+ 




i 

i 


1 c 
1 J 


1 ?S1 




0 


7R 

/ o 


1283 


34 


7 


5 


1285 


34 


7 


5 


1339 


13 


4 


3 


1474 


73 


0 


78 


1505 


18 


3 


6 


1553 


68 


6 


12 


1554 


2542 


14 


195 


1605 


2542 


14 


195 


1628 


6 


0 


6 


1643 


142 


4 


38 


1753 


12 


0 


10 


1764 


13 


0 


14 


SEQ ID NO: 


lib 15 clones 


lib 17 clones 


lib 15/lib 17 


SEQ ID NO:    Lib 18 Clones    Lib 20 Clones    lib 18/lib 20 
105           28               11               2 
198           21               0                18 
254           9                0                8 
412           9                0                8 
1011          11               1                9 
1138          14               0                12 
1253          23               0                20 
1643          18               0                15 
1764          12               0                10 
3156          140              43               3 



Table 13 

Differentially expressed polynucleotides isolated from samples from two patients 
(patient 2 and patient 3): Lower expression in normal colon tissue (patient 2: lib 15; 
patient 3: lib 18) vs. high metastatic potential colon tissue (patient 2: lib 17; 
patient 3: lib 20). 



SEQ ID NO: 


Lib 15 Clones 


Lib 17 Clones 


lib 17/lib 15 


321 


3 


23 


7 


363 


1 


9 


8 


836 


21 


99 


4 


859 


6 


20 


3 


885 


13 


28 


2 


916 


13 


28 


2 


981 


2 


11 


5 


1226 


8 


70 


8 


1308 


0 


8 


7 


1317 


29 


84 


3 


1429 


27 


127 


4 


1442 


0 


9 


8 


1534 


1 


12 


11 


1540 


12 


43 


3 


1552 


0 


7 


7 


1556 


1 


9 


8 


1557 


1 


9 


8 


1569 


2189 


5122 


2 


1571 


6 


18 


3 


1576 


3 


25 


8 


SEQ ID NO: 


Lib 15 Clones 


Lib 17 Clones 


lib 17/lib 15 


1581 


4 


22 


5 


1601 


25 


157 


6 


1613 


9 


48 


5 


1616 


15 


61 


4 


1620 


2 


17 


8 


1622 


4 


99 


23 


1626 


6 


35 


5 


1647 


4 


22 


5 


1664 


4 


28 


7 


1683 


2 


18 


8 


1704 


3 


15 


5 


1800 


0 


7 


7 


2749 


23 


60 


2 


2784 


4 


14 


3 


2805 


1 


9 


8 


2976 


3 


14 


4 


3128 


18 


57 


3 


3129 


26 


124 


4 


3146 


64 


210 


3 


3150 


940 


2267 


2 


3151 


2 


15 


7 










SEQ ID NO:    lib 18 clones    lib 20 clones    lib 20/lib 18 
865           0                5                6 
1569          1                7                8 
1580          1                7                8 
1590          1                7                8 
2790          0                5                6 



EXAMPLE 9 

Polynucleotides Differentially Expressed in High Colon Tumor Potential 
Patient Tissue Versus Metastasized Colon Cancer Patient Tissue 
A number of polynucleotide sequences have been identified that are 
differentially expressed between cells derived from colon cancer tissue and cells derived 
from liver metastases of colon cancer tissue. Expression of these sequences in colon 
cancer tissue can provide diagnostic, prognostic and/or treatment information associated 
with the transformation of precancerous tissue to malignant tissue. This information 
can be useful in preventing progression to the advanced malignant state in these 
tissues, and can be important in risk assessment for a patient. 

The following table summarizes identified polynucleotides with 
differential expression between high tumor potential colon cancer tissue and cells 
derived from high metastatic potential colon cancer cells: 



Table 14 

Differentially expressed polynucleotides: 
Greater expression in metastatic colon tumor tissue (lib 20) vs. 
colon tumor tissue (lib 19) 



SEQ ID NO:    lib 19 clones    lib 20 clones    lib 20/lib 19 
937           0                6                8 
976           0                5                7 
1520          1                8                11 
1546          1                11               15 
1550          1                11               15 
1574          1                8                11 
1580          0                7                9 
1590          0                7                9 
1599          8                21               4 
1607          158              632              5 
1622          1                7                9 



Table 15 

Greater expression in colon tumor tissue (lib 19) than metastatic colon tissue (lib 20) 



SEQ ID NO:    lib 19 clones    lib 20 clones    lib 19/lib 20 
105           64               11               4 
1011          53               1                40 
1226          18               4                3 
1571          8                0                6 
1726          15               3                4 
1811          17               2                6 
2749          47               6                6 
3146          19               2                7 
3324          20               1                15 

EXAMPLE 10 

Polynucleotides Differentially Expressed in High Tumor Potential 
Colon Cancer Patient Tissue Versus Normal Patient Tissue 
A number of polynucleotide sequences have been identified that are 
differentially expressed between cells derived from high tumor potential colon cancer 
tissue and normal tissue. Expression of these sequences in colon cancer tissue can 
provide diagnostic, prognostic and/or treatment information associated with the 
prevention of the malignant state in these tissues, and can be important in risk 
assessment for a patient. For example, sequences that are highly expressed in the 
high tumor potential colon cancer cells are associated with or can be indicative of increased 
expression of genes or regulatory sequences involved in early tumor progression. A 
patient sample displaying an increased level of one or more of these polynucleotides 
may thus warrant closer attention or more frequent screening procedures to catch the 
malignant state as early as possible. 

The following tables summarize polynucleotides that are differentially 
expressed between high metastatic potential colon cancer cells and normal colon cells: 

Table 16 

Differentially expressed polynucleotides detected in samples from a patient (patient 2): 
Higher expression in normal colon tissue (patient 2: lib 15) 
vs. tumor potential colon tissue (patient 2: lib 16) 



SEQ ID NO: 


lib 15 clones 


lib 16 clones 


lib 15/lib 16 


69 


19 


7 


3 


105 


116 


54 


2 


140 


24 


4 


6 


197 


6 


0 


6 


198 


113 


3 


40 


254 


28 


6 


5 


412 


28 


6 


5 


642 


7 


0 


7 


830 


10 


2 


5 


938 


31 


13 


3 


1011 


209 


37 


6 


1095 


12 


3 


4 


1125 


17 


0 


18 


SEQ ID NO: 


lib 15 clones 


lib 16 clones 


lib 15/lib 16 


1129 


17 


0 


18 


1138 


109 


1 


115 


1253 


73 


1 


77 


1283 


34 


13 


3 


1285 


34 


13 


3 


1339 


13 


3 


5 


1453 


11 


3 


4 


1474 


73 


1 


77 


1505 


18 


6 


3 


1554 


2542 


448 


6 


1605 


2542 


448 


6 


1614 


36 


14 


3 


1630 


24 


9 


3 


1643 


142 


2 


75 


1646 


39 


14 


3 


1649 


24 


8 


3 


1677 


19 


6 


3 


1753 


13 


0 


14 


1764 


13 


0 


14 


1766 


177 


65 


3 


1772 


24 


8 


3 



Table 17 

Differentially expressed polynucleotides detected in samples from a patient. Lower 
expression in normal colon tissue (lib 18) than in colon tumor tissue (lib 19) 



SEQ ID NO:    lib 18 clones    lib 19 clones    lib 19/lib 18 
3146          3                19               6 
3150          21               228              10 
3324          3                20               6 



